Re: Virtual generated columns
On Fri, May 16, 2025 at 1:00 PM Alexander Lakhin wrote: > I've discovered yet another way to trigger that error: > create table vt (a int, b int generated always as (a * 2), c int); > insert into vt values(1); > alter table vt alter column c type bigint using b + c; > > ERROR: XX000: unexpected virtual generated column reference > LOCATION: CheckVarSlotCompatibility, execExprInterp.c:2410 Thank you for the report. It seems that we fail to expand references to virtual generated columns in the NewColumnValues expression when altering tables. We might be able to fix it by: @@ -6203,7 +6203,7 @@ ATRewriteTable(AlteredTableInfo *tab, Oid OIDNewHeap) NewColumnValue *ex = lfirst(l); /* expr already planned */ - ex->exprstate = ExecInitExpr((Expr *) ex->expr, NULL); + ex->exprstate = ExecInitExpr((Expr *) expand_generated_columns_in_expr((Node *) ex->expr, oldrel, 1), NULL); Thanks Richard
Re: Add pg_buffercache_mark_dirty[_all] functions to the pg_buffercache
Hey, I noticed a couple of small clarity issues in the current version of the patch, for potential cleanup:

1. Commit message wording

Right now it says: “The pg_buffercache_mark_dirty_all() function provides an efficient way to dirty the entire buffer pool (e.g., ~550 ms vs. ~70 ms for 16 GB of shared buffers), complementing pg_buffercache_mark_dirty() for more granular control.”

That makes it sound like the *_all* function is the granular one, when really:
• pg_buffercache_mark_dirty(buffernumber) is the fine-grained, per-buffer call.
• pg_buffercache_mark_dirty_all() is the bulk, coarse-grained operation.

How about rephrasing to: “The pg_buffercache_mark_dirty_all() function provides an efficient, bulk way to mark every buffer dirty (e.g., ~70 ms vs. ~550 ms for 16 GB of shared buffers), while pg_buffercache_mark_dirty() still allows per-buffer, granular control.”

2. Inline comment in MarkUnpinnedBufferDirty

We currently have:

PinBuffer_Locked(desc); /* releases spinlock */

Folks who’re unfamiliar with this function might get confused. Maybe we could use the one in GetVictimBuffer:

/* Pin the buffer and then release its spinlock */
PinBuffer_Locked(buf_hdr);

That spelling-out makes it obvious what’s happening.

> Since that patch is targeted for PG 19, pg_buffercache is bumped to
> v1.7.
>
> Latest version is attached and people who already reviewed the patches are
> CCed.
Re: Virtual generated columns
On Fri, May 16, 2025 at 3:26 PM Richard Guo wrote: > > On Fri, May 16, 2025 at 1:00 PM Alexander Lakhin wrote: > > I've discovered yet another way to trigger that error: > > create table vt (a int, b int generated always as (a * 2), c int); > > insert into vt values(1); > > alter table vt alter column c type bigint using b + c; > > > > ERROR: XX000: unexpected virtual generated column reference > > LOCATION: CheckVarSlotCompatibility, execExprInterp.c:2410 > > Thank you for the report. It seems that we fail to expand references > to virtual generated columns in the NewColumnValues expression when > altering tables. We might be able to fix it by: > > @@ -6203,7 +6203,7 @@ ATRewriteTable(AlteredTableInfo *tab, Oid OIDNewHeap) > NewColumnValue *ex = lfirst(l); > > /* expr already planned */ > - ex->exprstate = ExecInitExpr((Expr *) ex->expr, NULL); > + ex->exprstate = ExecInitExpr((Expr *) > expand_generated_columns_in_expr((Node *) ex->expr, oldrel, 1), NULL); > we have used the USING expression in ATPrepAlterColumnType, ATColumnChangeRequiresRewrite. expanding it on ATPrepAlterColumnType seems to make more sense? @@ -14467,7 +14467,7 @@ ATPrepAlterColumnType(List **wqueue, */ newval = (NewColumnValue *) palloc0(sizeof(NewColumnValue)); newval->attnum = attnum; - newval->expr = (Expr *) transform; + newval->expr = (Expr *) expand_generated_columns_in_expr(transform, rel, 1); newval->is_generated = false;
Re: Backward movement of confirmed_flush resulting in data duplication.
Hi, On Tue, May 13, 2025 at 3:48 PM shveta malik wrote: > > > With the given script, the problem reproduces on Head and PG17. We are > trying to reproduce the issue on PG16 and below where injection points > are not there. > The issue can also be reproduced on PostgreSQL versions 13 through 16. The same steps shared earlier in the 'reproduce_data_duplicate_without_twophase.sh' script can be used to reproduce the issue on versions PG14 to PG16. Since back branches do not support injection points, you can add infinite loops at the locations where the patch 'v1-0001-Injection-points-to-reproduce-the-confirmed_flush.patch introduces injection points'. These loops allow holding and releasing processes using a debugger when needed. Attached are detailed documents describing the reproduction steps: 1) Use 'reproduce_steps_for_pg14_to_16.txt' for PG14 to PG16. 2) Use 'reproduce_steps_for_pg13.txt' for PG13. Note: PG13 uses temporary replication slots for tablesync workers, unlike later versions that use permanent slots. Because of this difference, some debugger-related steps differ slightly in PG13, which is why a separate document is provided for it. -- Thanks, Nisha Below are the detailed steps to follow to reproduce the issue on PG14 to PG16 versions using debugger: (Note: Since these steps are intended to be run manually, short delays like sleep 1 between steps are assumed and not explicitly mentioned. Any wait time longer than one second is explicitly called out.) 1. Set up the primary and subscriber nodes with the same configurations as shared in reproduce_data_duplicate_without_twophase.sh. (The script can be used to do the initial setup) 2. On Primary: Create table tab1, insert a value and create a publication psql -d postgres -p $port_primary -c "CREATE TABLE tab1(a int); INSERT INTO tab1 VALUES(1); CREATE PUBLICATION pub FOR TABLE tab1;" 3. On Subscriber: Create the same table tab1 psql -d postgres -p $port_subscriber -c "CREATE TABLE tab1(a int);" 4. On Subscriber: Start the subscription with copy_data to false psql -d postgres -p $port_subscriber -c "CREATE SUBSCRIPTION sub CONNECTION 'dbname=postgres port=$port_primary' PUBLICATION pub WITH (slot_name='logicalslot', create_slot=true, copy_data = false, enabled=true)" 5. Primary: Confirm the slot details psql -d postgres -p $port_primary -c "SELECT slot_name, restart_lsn, confirmed_flush_lsn FROM pg_replication_slots WHERE slot_name='logicalslot'" 6. Insert the data into tab1. The apply worker's origin_lsn and slot's confirmed_flush will be advanced to this INSERT lsn (say lsn1) psql -d postgres -p $port_primary -c "INSERT INTO tab1 VALUES(2);" 7. Check both confirmed_flush and origin_lsn values, both values should now match the LSN of the insert above (lsn1). psql -d postgres -p $port_subscriber -c "select * from pg_replication_origin_status where local_id = 1;" psql -d postgres -p $port_primary -c "SELECT slot_name, restart_lsn, confirmed_flush_lsn FROM pg_replication_slots WHERE slot_name='logicalslot'" 8. Add a new table (tab2) to the publication on Primary psql -d postgres -p $port_primary -c "CREATE TABLE tab2 (a int UNIQUE); ALTER PUBLICATION pub ADD TABLE tab2;" 9. Create tab2 on Subscriber psql -d postgres -p $port_subscriber -c "CREATE TABLE tab2 (a int UNIQUE);" 10. Refresh the subscription. It will start tablesync for tab2. psql -d postgres -p $port_subscriber -c "ALTER SUBSCRIPTION sub REFRESH PUBLICATION" 11. Attach debugger to the tablesync worker and hold it just before it sets the state to SUBREL_STATE_SYNCWAIT. 12. 
On Primary: Insert a row into tab2. Lets say the remote lsn for this change is lsn2. psql -d postgres -p $port_primary -c "INSERT INTO tab2 VALUES(2);" 13. Wait for 3+ seconds. The above insert will not be consumed by tablesync worker on sub yet. Apply worker will see this change and will ignore it. 14. Check that confirmed_flush has moved to lsn2 now (where lsn2 > lsn1 ) due to keepalive message handling in apply worker. And origin_lsn remains unchanged. psql -d postgres -p $port_subscriber -c "select * from pg_replication_origin_status where local_id = 1;" psql -d postgres -p $port_primary -c "SELECT slot_name, restart_lsn, confirmed_flush_lsn FROM pg_replication_slots WHERE slot_name='logicalslot'" 15. Attach another debugger to apply-worker process and hold it just before maybe_reread_subscription() call. 16. Release the tablesync worker from debugger - It will now move to SUBREL_STATE_SYNCWAIT state and will wait for apply worker to move to SUBREL_STATE_CATCHUP. 17. Disable the sub psql -d postgres -p $port_subscriber -c "alter subscription sub disable;" 18. Release the apply worker from debugger. - It will re-read subscription and move the state to SUBREL_STATE_CATCHUP. Then will exit due to sub being disabled. Tablesync will also exit here. 19. Enable the subscription again and let the apply worker start. - Tablesync now catchup
wrong query results on bf leafhopper
Hi, I noticed this recent BF failure: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=leafhopper&dt=2025-05-15%2008%3A10%3A04 === dumping /home/bf/proj/bf/build-farm-17/HEAD/pgsql.build/src/test/recovery/tmp_check/regression.diffs === diff -U3 /home/bf/proj/bf/build-farm-17/HEAD/pgsql.build/src/test/regress/expected/memoize.out /home/bf/proj/bf/build-farm-17/HEAD/pgsql.build/src/test/recovery/tmp_check/results/memoize.out --- /home/bf/proj/bf/build-farm-17/HEAD/pgsql.build/src/test/regress/expected/memoize.out 2025-05-15 08:10:04.211926695 + +++ /home/bf/proj/bf/build-farm-17/HEAD/pgsql.build/src/test/recovery/tmp_check/results/memoize.out 2025-05-15 08:18:29.117733601 + @@ -42,7 +42,7 @@ -> Nested Loop (actual rows=1000.00 loops=N) -> Seq Scan on tenk1 t2 (actual rows=1000.00 loops=N) Filter: (unique1 < 1000) - Rows Removed by Filter: 9000 + Rows Removed by Filter: 8982 -> Memoize (actual rows=1.00 loops=N) Cache Key: t2.twenty Cache Mode: logical @@ -178,7 +178,7 @@ -> Nested Loop (actual rows=1000.00 loops=N) -> Seq Scan on tenk1 t1 (actual rows=1000.00 loops=N) Filter: (unique1 < 1000) - Rows Removed by Filter: 9000 + Rows Removed by Filter: 8981 -> Memoize (actual rows=1.00 loops=N) Cache Key: t1.two, t1.twenty Cache Mode: binary For a moment I thought this could be a bug in memoize, but that doesn't actually make sense - the failure isn't in memoize, it's the seqscan. Subsequently I got worried that this is an AIO bug or such causing wrong query results. But there are instances of this error well before AIO was merged. E.g. https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=leafhopper&dt=2024-12-18%2023%3A35%3A04 The same error is also present down to 16. In 15, I saw a potentially related error https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=leafhopper&dt=2024-12-16%2023%3A43%3A03 There have been other odd things on leafhopper, see e.g.: https://www.postgresql.org/message-id/35d87371-f3ab-42c8-9aac-bb39ab5bd987%40gmail.com https://postgr.es/m/Z4npAKvchWzKfb_r%40paquier.xyz Greetings, Andres Freund
C extension compilation failed while using PG 17.2 on mac m1
Hello, while compiling my C extension with PG 17.2 I am getting

postgresql-17.2/pgsql/include/server/port/pg_iovec.h:93:10: error: call to undeclared function 'pwritev'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
93 | return pwritev(fd, iov, iovcnt, offset);

I am not using pg_iovec.h or pwrite/pread related functions. My extension compiles fine without any issues on PG 14.3.

*PG compilation commands:*
./configure --prefix=$PWD/pgsql --enable-cassert --enable-debug CFLAGS="-ggdb -fno-omit-frame-pointer -O0" --without-icu
make clean
make -j
make install -j

*Mac OS:* 15.1.1

Please let me know if you need any other information. Thanks & Regards, Narayana
Re: Align wording on copyright organization
On Fri, 16 May 2025 at 09:12, Daniel Gustafsson wrote: > As was briefly discussed in the developers meeting, and the hallway track, > at > PGConf.dev we have a few variations on the organization wording which > really > should be aligned with the wording defined by -core in pgweb commit > 2d764dbc08. > Thanks Daniel - the patch looks good to me. You'll push this to all active branches, correct? -- Dave Page pgAdmin: https://www.pgadmin.org PostgreSQL: https://www.postgresql.org pgEdge: https://www.pgedge.com
Re: Proposal: Make cfbot fail on patches not created by "git format-patch"
On Fri, May 16, 2025 at 12:12 PM Tom Lane wrote: > That outcome seems entirely horrible to me. If you want to flag the lack > of a commit message somehow, fine, but don't prevent CI from running. Personally I also prefer nudges to gates. Just like people already deprioritize "Waiting on Author" entries a bit, having an obvious "Patch Needs Work" note might gently help newcomers iterate on their first submissions (or even communicate where a patch is in the lifecycle! e.g. a Bugfix entry where the patch is marked incomplete might motivate someone to jump in to fix it). --Jacob
Re: Make wal_receiver_timeout configurable per subscription
On Fri, May 16, 2025 at 9:11 PM Fujii Masao wrote: > Hi, > > When multiple subscribers connect to different publisher servers, > it can be useful to set different wal_receiver_timeout values for > each connection to better detect failures. However, this isn't > currently possible, which limits flexibility in managing subscriptions. > > Hi,+1 for the idea. > > One approach is to add wal_receiver_timeout as a parameter to > CREATE SUBSCRIPTION command, storing it in pg_subscription > so each logical replication worker can use its specific value. > > Another option is to change the wal_receiver_timeout's GUC context > from PGC_SIGHUP to PGC_USERSET. This would allow setting different > values via ALTER ROLE SET command for each subscription owner - > effectively enabling per-subscription configuration. Since this > approach is simpler and likely sufficient, I'd prefer starting with this. > Thought? Both ways LGTM,for starters we can go with changing GUC's context. > BTW, this could be extended in the future to other GUCs used by > logical replication workers, such as wal_retrieve_retry_interval. > > +1 for extending this idea for other GUCs as well. -- Thanks, Srinath Reddy Sadipiralla EDB: https://www.enterprisedb.com/
Re: Statistics Import and Export
Gentle ping on this. --- Hari Krishna Sunder On Wed, May 14, 2025 at 1:30 PM Hari Krishna Sunder wrote: > Thanks Nathan. > Here is the patch with a comment. > > On Wed, May 14, 2025 at 8:53 AM Nathan Bossart > wrote: > >> On Tue, May 13, 2025 at 05:01:02PM -0700, Hari Krishna Sunder wrote: >> > We found a minor issue when testing statistics import with upgrading >> from >> > versions older than v14. (We have VACUUM and ANALYZE disabled) >> > 3d351d916b20534f973eda760cde17d96545d4c4 >> > < >> https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=3d351d916b20534f973eda760cde17d96545d4c4 >> > >> > changed >> > the default value for reltuples from 0 to -1. So when such tables are >> > imported they get the pg13 default of 0 which in pg18 is treated >> > as "vacuumed and seen to be empty" instead of "never yet vacuumed". The >> > planner then proceeds to pick seq scans even if there are indexes for >> these >> > tables. >> > This is a very narrow edge case and the next VACUUM or ANALYZE will fix >> it >> > but the perf of these tables immediately after the upgrade is >> considerably >> > affected. >> >> There was a similar report for vacuumdb's new --missing-stats-only option. >> We fixed that in commit 9879105 by removing the check for reltuples != 0, >> which means that --missing-stats-only will process empty tables. >> >> > Can we instead use -1 if the version is older than 14, and reltuples is >> 0? >> > This will have the unintended consequence of treating a truly empty >> table >> > as "never yet vacuumed", but that should be fine as empty tables are >> going >> > to be fast regardless of the plan picked. >> >> I'm inclined to agree that we should do this. Even if it's much more >> likely that 0 means empty versus not-yet-processed, the one-time cost of >> processing some empty tables doesn't sound too bad. In any case, since >> this only applies to upgrades from > over time. >> >> > PS: This is my first patch, so apologies for any issues with the patch. >> >> It needs a comment, but otherwise it looks generally reasonable to me >> after >> a quick glance. >> >> -- >> nathan >> >
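For readers following along, the adjustment being discussed amounts to something like the following sketch. The variable names and the exact spot in the statistics-import path are assumptions for illustration only, not taken from the attached patch:

    /*
     * Illustrative sketch: source clusters older than v14 initialized
     * reltuples to 0, so a stored 0 most likely means "never yet vacuumed"
     * rather than "seen to be empty"; map it to the modern -1 convention.
     * The trade-off is that genuinely empty tables are also treated as
     * not-yet-vacuumed, which the next VACUUM or ANALYZE corrects cheaply.
     */
    if (source_version < 140000 && reltuples == 0)
        reltuples = -1;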
Re: PG 17.2 compilation fails with -std=c11 on mac
I wrote: > We lack that #define, which results in not supplying > the declaration. AFAICT the requirement for this is in the C11 > standard, this is not just Apple doing something weird. > Aside from this compilation error, we're probably failing to use > memset_s on some platforms where it's available, for lack of > the #define in our configure/meson probes. I know how to fix > that in configure, but I'm less clear on how nonstandard probes > work in meson --- any help there? Here's a lightly-tested fix for that (against HEAD, but it likely works on v17 too). regards, tom lane diff --git a/configure b/configure index 275c67ee67c..4f15347cc95 100755 --- a/configure +++ b/configure @@ -15616,7 +15616,7 @@ fi LIBS_including_readline="$LIBS" LIBS=`echo "$LIBS" | sed -e 's/-ledit//g' -e 's/-lreadline//g'` -for ac_func in backtrace_symbols copyfile copy_file_range elf_aux_info getauxval getifaddrs getpeerucred inet_pton kqueue localeconv_l mbstowcs_l memset_s posix_fallocate ppoll pthread_is_threaded_np setproctitle setproctitle_fast strsignal syncfs sync_file_range uselocale wcstombs_l +for ac_func in backtrace_symbols copyfile copy_file_range elf_aux_info getauxval getifaddrs getpeerucred inet_pton kqueue localeconv_l mbstowcs_l posix_fallocate ppoll pthread_is_threaded_np setproctitle setproctitle_fast strsignal syncfs sync_file_range uselocale wcstombs_l do : as_ac_var=`$as_echo "ac_cv_func_$ac_func" | $as_tr_sh` ac_fn_c_check_func "$LINENO" "$ac_func" "$as_ac_var" @@ -16192,6 +16192,19 @@ cat >>confdefs.h <<_ACEOF #define HAVE_DECL_STRCHRNUL $ac_have_decl _ACEOF +ac_fn_c_check_decl "$LINENO" "memset_s" "ac_cv_have_decl_memset_s" "#define __STDC_WANT_LIB_EXT1__ 1 +#include +" +if test "x$ac_cv_have_decl_memset_s" = xyes; then : + ac_have_decl=1 +else + ac_have_decl=0 +fi + +cat >>confdefs.h <<_ACEOF +#define HAVE_DECL_MEMSET_S $ac_have_decl +_ACEOF + # This is probably only present on macOS, but may as well check always ac_fn_c_check_decl "$LINENO" "F_FULLFSYNC" "ac_cv_have_decl_F_FULLFSYNC" "#include diff --git a/configure.ac b/configure.ac index 7ea91d56adb..4b8335dc613 100644 --- a/configure.ac +++ b/configure.ac @@ -1792,7 +1792,6 @@ AC_CHECK_FUNCS(m4_normalize([ kqueue localeconv_l mbstowcs_l - memset_s posix_fallocate ppoll pthread_is_threaded_np @@ -1838,6 +1837,8 @@ AC_CHECK_DECLS([strlcat, strlcpy, strnlen, strsep, timingsafe_bcmp]) AC_CHECK_DECLS([preadv], [], [], [#include ]) AC_CHECK_DECLS([pwritev], [], [], [#include ]) AC_CHECK_DECLS([strchrnul], [], [], [#include ]) +AC_CHECK_DECLS([memset_s], [], [], [#define __STDC_WANT_LIB_EXT1__ 1 +#include ]) # This is probably only present on macOS, but may as well check always AC_CHECK_DECLS(F_FULLFSYNC, [], [], [#include ]) diff --git a/meson.build b/meson.build index 12de5e80c31..d142e3e408b 100644 --- a/meson.build +++ b/meson.build @@ -2654,6 +2654,7 @@ decl_checks += [ ['preadv', 'sys/uio.h'], ['pwritev', 'sys/uio.h'], ['strchrnul', 'string.h'], + ['memset_s', 'string.h', '#define __STDC_WANT_LIB_EXT1__ 1'], ] # Check presence of some optional LLVM functions. 
@@ -2667,21 +2668,23 @@ endif foreach c : decl_checks func = c.get(0) header = c.get(1) - args = c.get(2, {}) + prologue = c.get(2, '') + args = c.get(3, {}) varname = 'HAVE_DECL_' + func.underscorify().to_upper() found = cc.compiles(''' -#include <@0@> +@0@ +#include <@1@> int main() { -#ifndef @1@ -(void) @1@; +#ifndef @2@ +(void) @2@; #endif return 0; } -'''.format(header, func), +'''.format(prologue, header, func), name: 'test whether @0@ is declared'.format(func), # need to add cflags_warn to get at least # -Werror=unguarded-availability-new if applicable @@ -2880,7 +2883,6 @@ func_checks = [ ['kqueue'], ['localeconv_l'], ['mbstowcs_l'], - ['memset_s'], ['mkdtemp'], ['posix_fadvise'], ['posix_fallocate'], diff --git a/src/include/pg_config.h.in b/src/include/pg_config.h.in index c3cc9fa856d..726a7c1be1f 100644 --- a/src/include/pg_config.h.in +++ b/src/include/pg_config.h.in @@ -91,6 +91,10 @@ `LLVMCreatePerfJITEventListener', and to 0 if you don't. */ #undef HAVE_DECL_LLVMCREATEPERFJITEVENTLISTENER +/* Define to 1 if you have the declaration of `memset_s', and to 0 if you + don't. */ +#undef HAVE_DECL_MEMSET_S + /* Define to 1 if you have the declaration of `posix_fadvise', and to 0 if you don't. */ #undef HAVE_DECL_POSIX_FADVISE @@ -291,9 +295,6 @@ /* Define to 1 if you have the header file. */ #undef HAVE_MEMORY_H -/* Define to 1 if you have the `memset_s' function. */ -#undef HAVE_MEMSET_S - /* Define to 1 if you have the `mkdtemp' function. */ #undef HAVE_MKDTEMP diff --git a/src/port/explicit_bzero.c b/src/port/explicit_bzero.c index 1d37b119bab..358a692f357 100644 --- a/src/port/explicit_bzero.c +++ b/src/port/explicit_bzero.c @@ -12,9 +12,11 @@ *---
Re: Should we optimize the `ORDER BY random() LIMIT x` case?
On 16/05/2025 15:01, Tom Lane wrote: Aleksander Alekseev writes: If I'm right about the limitations of aggregate functions and SRFs this leaves us the following options: 1. Changing the constraints of aggregate functions or SRFs. However I don't think we want to do it for such a single niche scenario. 2. Custom syntax and a custom node. 3. To give up Seems to me the obvious answer is to extend TABLESAMPLE (or at least, some of the tablesample methods) to allow it to work on a subquery. Isn't this a job for <fetch first clause>? Example: SELECT ... FROM ... JOIN ... FETCH SAMPLE FIRST 10 ROWS ONLY Then the nodeLimit could do some sort of reservoir sampling. There are several enhancements to <fetch first clause> coming down the pipe, this could be one of them. -- Vik Fearing
Re: Should we optimize the `ORDER BY random() LIMIT x` case?
Vik Fearing writes: > On 16/05/2025 15:01, Tom Lane wrote: >> Seems to me the obvious answer is to extend TABLESAMPLE (or at least, some >> of the tablesample methods) to allow it to work on a subquery. > Isn't this a job for <fetch first clause>? > FETCH SAMPLE FIRST 10 ROWS ONLY How is that an improvement on TABLESAMPLE? Or did the committee forget that they already have that feature? TABLESAMPLE seems strictly better to me here because it affords the opportunity to specify one of several methods, which seems like it would be useful in this context. regards, tom lane
Re: Should we optimize the `ORDER BY random() LIMIT x` case?
On 16/05/2025 23:21, Tom Lane wrote: Vik Fearing writes: On 16/05/2025 15:01, Tom Lane wrote: Seems to me the obvious answer is to extend TABLESAMPLE (or at least, some of the tablesample methods) to allow it to work on a subquery. Isn't this a job for <fetch first clause>? FETCH SAMPLE FIRST 10 ROWS ONLY How is that an improvement on TABLESAMPLE? Or did the committee forget that they already have that feature? TABLESAMPLE seems strictly better to me here because it affords the opportunity to specify one of several methods, which seems like it would be useful in this context. TABLESAMPLE is hitched to a <table primary> which can be basically anything resembling a relation. So it appears the standard already allows this and we just need to implement it. -- Vik Fearing
Re: Should we optimize the `ORDER BY random() LIMIT x` case?
On Fri, May 16, 2025 at 09:01:50AM -0400, Tom Lane wrote: > Seems to me the obvious answer is to extend TABLESAMPLE (or at least, some > of the tablesample methods) to allow it to work on a subquery. The key here is that we need one bit of state between rows: the count of rows seen so far.
Re: Should we optimize the `ORDER BY random() LIMIT x` case?
On Fri, May 16, 2025 at 11:10:49PM +0200, Vik Fearing wrote: > Isn't this a job for <fetch first clause>? > > Example: > > SELECT ... > FROM ... JOIN ... > FETCH SAMPLE FIRST 10 ROWS ONLY > > Then the nodeLimit could do some sort of reservoir sampling. The query might return fewer than N rows. What reservoir sampling requires is this bit of state: the count of input rows so far. The only way I know of to keep such state in a SQL query is with a RECURSIVE CTE, but unfortunately that would require unbounded CTE size, and it would require a way to query next rows one per iteration. Nico --
Foreign key isolation tests
Here are a couple new isolation tests for foreign keys, based on feedback from the Advanced Patch Review session at PGConf.dev. The goal is to show that temporal foreign keys do not violate referential integrity under concurrency. Non-temporal foreign keys use a crosscheck snapshot to avoid this. Temporal foreign keys do the same, but this test gives us more assurance. I noticed that traditional foreign keys didn't have a test either, at least for the case we discussed: one transaction INSERTS a referencing row, while the other DELETEs the referenced row. So I have two patches here: one adding tests for the traditional case; another, the temporal case. There is a final patch improving the comment for some snapshot macros. While fixing a typo I thought I could improve their clarity a bit as well. Yours, -- Paul ~{:-) p...@illuminatedcomputing.com v1-0003-Improve-comment-about-snapshot-macros.patch Description: Binary data v1-0001-Fill-testing-gap-for-possible-referential-integri.patch Description: Binary data v1-0002-Add-test-for-temporal-referential-integrity.patch Description: Binary data
Re: Should we optimize the `ORDER BY random() LIMIT x` case?
On Thu, May 15, 2025 at 02:41:15AM +0300, Aleksander Alekseev wrote: > SELECT t.ts, t.city, t.temperature, h.humidity > FROM temperature AS t > LEFT JOIN LATERAL > ( SELECT * FROM humidity > WHERE city = t.city AND ts <= t.ts > ORDER BY ts DESC LIMIT 1 > ) AS h ON TRUE > WHERE t.ts < '2022-01-05'; > ``` > > One can do `SELECT (the query above) ORDER BY random() LIMIT x` but > this produces an inefficient plan. Alternatively one could create I assume it gathers the whole result and _then_ orders by random(), yes? I think the obvious thing to do is to have the optimizer recognize `ORDER BY random() LIMIT x` and do the reservoir sampling thing as the underlying query produces rows. The engine needs to keep track of how many rows it's seen so far so that with each new row it can use `random() < 1/n` to decide whether to keep the row. > temporary tables using `CREATE TEMP TABLE ... AS SELECT * FROM tbl > TABLESAMPLE BERNOULLI(20)` but this is inconvenient and would be > suboptimal even if we supported global temporary tables. Speaking of which, what would it take to have global temp tables? > 1. Do you think there might be value in addressing this issue? Yes! I think `ORDER BY random() LIMIT x` is succinct syntax, and semantically it should yield as uniformly distributed a result set as possible even when the size of the underlying query is not knowable a priori. And it should be as optimal as possible considering that the underlying query's result set is in principle -and many useful real use cases- unbounded. And reservoir sampling is a suitable existing technique to get an optimal and uniform random finite sampling of an indefinite size row set. > 2. If yes, how would you suggest addressing it from the UI point of > view - by adding a special syntax, some sort of aggregate function, or > ...? `ORDER BY random() LIMIT x` is a) in the standard, b) semantically what you're asking for, c) succint syntax, d) implementable optimally, therefore IMO `ORDER BY random() LIMIT x` is the correct syntax. Nico --
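For readers unfamiliar with the technique, below is a minimal sketch of reservoir sampling (Algorithm R) in plain C, generalized to keep x rows; for x = 1 it reduces to keeping each new row with probability 1/n, as described above. The struct and function names are illustrative, not from any patch:

#include <stdlib.h>

/*
 * Keep a uniform random sample of "limit" rows from an input stream of
 * unknown length.  The only state carried between rows is the reservoir
 * itself and the count of rows seen so far.
 */
typedef struct Reservoir
{
    int     limit;      /* target sample size (the LIMIT x) */
    long    seen;       /* number of input rows seen so far */
    int    *rows;       /* sampled payload; an int row id for simplicity */
} Reservoir;

void
reservoir_add(Reservoir *r, int row)
{
    r->seen++;
    if (r->seen <= r->limit)
        r->rows[r->seen - 1] = row;     /* fill the reservoir first */
    else
    {
        /* keep the new row with probability limit/seen */
        long    slot = random() % r->seen;

        if (slot < r->limit)
            r->rows[slot] = row;
    }
}

int
main(void)
{
    int         sample[10];
    Reservoir   r = {10, 0, sample};

    for (int row = 0; row < 100000; row++)
        reservoir_add(&r, row);

    /* sample[] now holds 10 uniformly chosen row ids */
    return 0;
}

After the last input row the reservoir holds a uniform sample regardless of how many rows the input produced, which is exactly the single pass over an unbounded input described above.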
pg_upgrade ability to create extension from scripts
It's my understanding that right now when you run pg_upgrade it creates the extension from what exists in the to-be-upgraded databases. Is there a reason why we can't have some sort of switch option that allows the CREATE EXTENSION to run from the scripts instead of what is loaded in the db? I'm asking for a couple of reasons.

1) For projects that version their libraries, sometimes the old versioned library does not exist in the new install and is not available for it. The PostGIS project largely addressed this issue by major-versioning our extension, but of course this would still be an issue for folks going from PostGIS 2.5 to PostGIS 3+.

2) Yes, I know this is a no-no, but the postgis spatial_ref_sys table is assumed to be an essentially static table, and people build table constraints using ST_Transform. Also, as we observed, for some reason the postgis.spatial_ref_sys table is sometimes not created before user data is loaded, or is empty and loaded after user data. As discussed here - https://trac.osgeo.org/postgis/ticket/5899

3) People often forget to run the extension upgrade scripts after doing a pg_upgrade and so are left in an incomplete upgrade state. Having the extension be the latest version would largely solve this.

There are other issues we've run into in the past where a user's old extension cannot be upgraded cleanly; we've had this recently with BRIN. Note I'm not asking to change the default behavior, just wondering if there are obstacles to giving users an option to install from the scripts instead of what's currently in the database. Thanks, Regina
Re: Proposal: Make cfbot fail on patches not created by "git format-patch"
On Fri, 16 May 2025 at 12:24, Jacob Champion wrote: > > On Fri, May 16, 2025 at 12:12 PM Tom Lane wrote: > > That outcome seems entirely horrible to me. If you want to flag the lack > > of a commit message somehow, fine, but don't prevent CI from running. > > Personally I also prefer nudges to gates. Okay, reasonable feedback. How about we add a CI job that does a "quality check". That's much less strong, as all the other tests will still run, but people would get a failing CI job which tells them that something is wrong. We could also include a pgindent in that CI job.
Re: Proposal: Make cfbot fail on patches not created by "git format-patch"
Jelte Fennema-Nio writes: > Okay, reasonable feedback. How about we add a CI job that does a > "quality check". That's much less strong, as all the other tests will > still run, but people would get a failing CI job which tells them that > something is wrong. We could also include a pgindent in that CI job. WFM regards, tom lane
Re: Add comment explaining why queryid is int64 in pg_stat_statements
Hi Ilia Evdokimov, While it's true that no arithmetic or logical operations are performed on queryid after the assignment, the overflow technically occurs at the point of assignment itself. For example, entry->key.queryid holds the value 12747288675711951805 as a uint64, but after assigning it to queryid (which is an int64), it becomes -5699455397997599811 due to overflow. This conversion is intentional - most likely to match the bigint type of the queryid column in pg_stat_statements. However, without an explicit comment, this can be misleading. A beginner reading this might misinterpret it as an unintentional overflow or bug and raise unnecessary concerns. Therefore, it’s worth adding a brief comment clarifying the intent behind this assignment. Thanks & Regards, Shaik Mohammad Mujeeb Member Technical Staff Zoho Corp On Fri, 16 May 2025 15:12:41 +0530 Ilia Evdokimov wrote --- On 15.05.2025 10:08, Shaik Mohammad Mujeeb wrote: I don't think the comment is necessary here. There are no arithmetic or logical operations performed on it. It is only passed as a Datum. -- Best regards, Ilia Evdokimov, Tantor Labs LLC. Hi Developers, In pg_stat_statements.c, the function pg_stat_statements_internal() declares the queryid variable as int64, but assigns it a value of type uint64. At first glance, this might appear to be an overflow issue. However, I think this is intentional - the queryid is cast to int64 to match the bigint type of the queryid column in the pg_stat_statements view. Please find the attached patch, which adds a clarifying comment to make the rationale explicit and avoid potential confusion for future readers. Thanks and Regards, Shaik Mohammad Mujeeb Member Technical Staff Zoho Corp
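As a standalone illustration of the conversion being described (ordinary C, not the pg_stat_statements code itself): assigning a uint64 value above INT64_MAX to an int64 keeps the same 64-bit pattern but reinterprets it as a negative number on the two's-complement platforms PostgreSQL runs on.

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

int
main(void)
{
    /* the example value quoted above */
    uint64_t    raw = UINT64_C(12747288675711951805);
    int64_t     as_signed = (int64_t) raw;  /* what the bigint column exposes */

    printf("unsigned: %" PRIu64 "\n", raw);         /* 12747288675711951805 */
    printf("signed:   %" PRId64 "\n", as_signed);   /* -5699455397997599811 */
    return 0;
}

The bits are unchanged; only the interpretation differs, which is why exposing the value through the view's bigint column loses no information.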
Re: Proposal: Make cfbot fail on patches not created by "git format-patch"
On Fri, 16 May 2025 at 12:05, Daniel Gustafsson wrote: > > > On 16 May 2025, at 11:52, Jelte Fennema-Nio wrote: > > > Does anyone have strong opposition to this? To be clear, it means we don't > > run CI on patches created by piping "git diff" to a file anymore, as a way > > to nudge submitters into providing useful commit messages. > > Disclaimer: I wasn't in the session (due to conflicting interesting sessions) > so I might be raising points/questions already answered. > > Is this really lowering the bar for new contributors? I've always held "be > liberal in what you accept" as a gold standard for projects I'm involved in, > to > remove barriers to entry. Good commit messages are obviously very important, > but having your patch rejected (yes, I know, failing to apply) might not be > strongest motivator for achieving this. Lowering the bar for new contributors wasn't the purpose of this change in policy. It's meant to reduce the work that committers and reviewers have to do, which then in turn would result in quicker reviews/commits. In my experience with other open source projects new contributors are usually fine with adhering to project standards, if they are told what those standards are. e.g. these days basically every popular open source project is running a CI job that fails if the auto-formatter fails.
Re: Conflict detection for update_deleted in logical replication
On Fri, May 16, 2025 at 12:01 PM shveta malik wrote: > > On Fri, May 16, 2025 at 11:15 AM Amit Kapila wrote: > > > > > > BTW, another related point is that when we decide to stop retaining > > dead tuples (via should_stop_conflict_info_retention), should we also > > consider the case that the apply worker didn't even try to get the > > publisher status because previously it decided that > > oldest_nonremovable_xid cannot be advanced due to its > > OldestActiveTransactionId? > > > > Do you mean avoid stop-conflict-retention in such a case as apply > worker itself did not request status from the publisher? If I > understood your point correctly, then we can do that by advancing the > timer to a new value even if we did not update candidate-xid and did > not ask the status from the publisher. > But candidate_xid_time is also used in wait_for_local_flush() to check clock_skew between publisher and subscriber, so for that purpose, it is better to set it along with candidate_xid. However, can't we rely on the valid value of candidate_xid to ensure that apply worker didn't send any request? Note that we always reset candidate_xid once we have updated oldest_nonremovable_xid. -- With Regards, Amit Kapila.
Re: Add comment explaining why queryid is int64 in pg_stat_statements
On 15.05.2025 10:08, Shaik Mohammad Mujeeb wrote: Hi Developers, In pg_stat_statements.c, the function /pg_stat_statements_internal()/ declares the /queryid/ variable as *int64*, but assigns it a value of type *uint64*. At first glance, this might appear to be an overflow issue. However, I think this is intentional - the queryid is cast to /int64/ *to match the bigint type of the queryid column in the pg_stat_statements view*. Please find the attached patch, which adds a clarifying comment to make the rationale explicit and avoid potential confusion for future readers. Thanks and Regards, Shaik Mohammad Mujeeb Member Technical Staff Zoho Corp I don't think the comment is necessary here. There are no arithmetic or logical operations performed on it. It is only passed as a Datum. -- Best regards, Ilia Evdokimov, Tantor Labs LLC.
PG 17.2 compilation fails with -std=c11 on mac
Hello, when I try to compile postgres 17.2 with -std=c11 I am getting the below error on mac:

explicit_bzero.c:22:9: error: call to undeclared function 'memset_s'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
22 | (void) memset_s(buf, len, 0, len);

*PG compilation commands:*
./configure --prefix=$PWD/pgsql --enable-cassert --enable-debug CFLAGS="-ggdb -fno-omit-frame-pointer -O0 -std=c11" --without-icu
make -s clean
make -s -j
make install -j

*Mac OS:* 15.1.1

*Temporary fix:* without *-std=c11*, or replacing *memset_s* with *memset* in *explicit_bzero*, it is working.

*The same issue was reported in 16.2:* https://www.postgresql.org/message-id/flat/CAJ4xhPmqZuYb2ydsKqkfm7wSnmVMD2cqQ%2By51qhTWkb-SfVG-w%40mail.gmail.com

Thanks & Regards Narayana
Re: Align wording on copyright organization
Dave Page writes: > On Fri, 16 May 2025 at 09:12, Daniel Gustafsson wrote: >> As was briefly discussed in the developers meeting, and the hallway track, >> at >> PGConf.dev we have a few variations on the organization wording which >> really >> should be aligned with the wording defined by -core in pgweb commit >> 2d764dbc08. > Thanks Daniel - the patch looks good to me. > You'll push this to all active branches, correct? +1 --- thanks Daniel, I had this on my to-do list also. regards, tom lane
Re: Should we optimize the `ORDER BY random() LIMIT x` case?
Aleksander Alekseev writes: > If I'm right about the limitations of aggregate functions and SRFs > this leaves us the following options: > 1. Changing the constraints of aggregate functions or SRFs. However I > don't think we want to do it for such a single niche scenario. > 2. Custom syntax and a custom node. > 3. To give up Seems to me the obvious answer is to extend TABLESAMPLE (or at least, some of the tablesample methods) to allow it to work on a subquery. regards, tom lane
Align wording on copyright organization
As was briefly discussed in the developers meeting, and the hallway track, at PGConf.dev we have a few variations on the organization wording which really should be aligned with the wording defined by -core in pgweb commit 2d764dbc08. -- Daniel Gustafsson v1-0001-Align-organization-wording-in-copyright-statement.patch Description: Binary data
Document default values for pgoutput options + fix missing initialization for "origin"
Hi, The pgoutput plugin options are documented in the logical streaming replication protocol, but their default values are not mentioned. This can be inconvenient for users - for example, when using pg_recvlogical with pgoutput plugin and needing to know the default behavior of each option. https://www.postgresql.org/docs/devel/protocol-logical-replication.html I'd like to propose adding the default values to the documentation to improve clarity and usability. Patch attached (0001 patch). While working on this, I also noticed that although most optional parameters (like "binary") are explicitly initialized in parse_output_parameters(), the "origin" parameter is not. Its value (PGOutputData->publish_no_origin) is implicitly set to false due to zero-initialization, but unlike other parameters, it lacks an explicit default assignment. To ensure consistency and make the behavior clearer, I propose explicitly initializing the "origin" parameter as well. A patch for this is also attached (0002 patch). Thoughts? Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION From 60b83897c786c89a1060d9c41adced566b7189b5 Mon Sep 17 00:00:00 2001 From: Fujii Masao Date: Fri, 16 May 2025 23:09:55 +0900 Subject: [PATCH v1 1/2] doc: Document default values for pgoutput options in protocol.sgml. The pgoutput plugin options are described in the logical streaming replication protocol documentation, but their default values were previously not mentioned. This made it less convenient for users, for example, when specifying those options to use pg_recvlogical with pgoutput plugin. This commit adds the explanations of the default values for pgoutput options to improve clarity and usability. --- doc/src/sgml/protocol.sgml | 9 +++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml index c4d3853cbf2..b9fac360fbf 100644 --- a/doc/src/sgml/protocol.sgml +++ b/doc/src/sgml/protocol.sgml @@ -3482,6 +3482,7 @@ psql "dbname=postgres replication=database" -c "IDENTIFY_SYSTEM;" Boolean option to use binary transfer mode. Binary mode is faster than the text mode but slightly less robust. + The default is false. @@ -3494,6 +3495,7 @@ psql "dbname=postgres replication=database" -c "IDENTIFY_SYSTEM;" Boolean option to enable sending the messages that are written by pg_logical_emit_message. + The default is false. @@ -3505,10 +3507,11 @@ psql "dbname=postgres replication=database" -c "IDENTIFY_SYSTEM;" Boolean option to enable streaming of in-progress transactions. - It accepts an additional value "parallel" to enable sending extra + It accepts an additional value parallel to enable sending extra information with some messages to be used for parallelisation. Minimum protocol version 2 is required to turn it on. Minimum protocol - version 4 is required for the "parallel" option. + version 4 is required for the parallel option. + The default is false. @@ -3521,6 +3524,7 @@ psql "dbname=postgres replication=database" -c "IDENTIFY_SYSTEM;" Boolean option to enable two-phase transactions. Minimum protocol version 3 is required to turn it on. + The default is false. @@ -3537,6 +3541,7 @@ psql "dbname=postgres replication=database" -c "IDENTIFY_SYSTEM;" to send the changes regardless of their origin. This can be used to avoid loops (infinite replication of the same data) among replication nodes. + The default is any. 
-- 2.49.0 From 762396c608892a90a6943f9f8bfa017d289075a9 Mon Sep 17 00:00:00 2001 From: Fujii Masao Date: Fri, 16 May 2025 23:53:08 +0900 Subject: [PATCH v1 2/2] pgoutput: Initialize missing default for "origin" parameter. The pgoutput plugin initializes optional parameters like "binary" with default values at the start of processing. However, the "origin" parameter was previously missed and left without explicit initialization. Although the PGOutputData struct, which holds these settings, is zero-initialized at allocation (resulting in publish_no_origin field for "origin" parameter being false by default), this default was not set explicitly, unlike other parameters. This commit adds explicit initialization of the "origin" parameter to ensure consistency and clarity in how defaults are handled. --- src/backend/replication/pgoutput/pgoutput.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c index 693a766e6d7..d7099c060d9 100644 --- a/src/backend/replication/pgoutput/pgoutput.c +++ b/src/backend/replication/pgoutput/pgoutput.c @@ -297,10 +297,12 @@ parse_output_parameters(List *options, PGOutput
Re: Add comment explaining why queryid is int64 in pg_stat_statements
Hi Shaik > While it's true that no arithmetic or logical operations are performed on queryid after the assignment, the overflow technically > occurs at the point of assignment itself. For example, *entry->key.queryid* holds the value *12747288675711951805* as a > *uint64*, but after assigning it to *queryid* (which is an *int64*), it becomes *-5699455397997599811* *due to overflow*. > This conversion is intentional - most likely to match the *bigint* type of the *queryid* column in *pg_stat_statements*. However, > without an explicit comment, this can be misleading. A beginner reading this might misinterpret it as an unintentional overflow > or bug and raise unnecessary concerns. Therefore, it’s worth adding a brief comment clarifying the intent behind this > assignment. +1 agree On Sat, May 17, 2025 at 1:54 AM Shaik Mohammad Mujeeb < mujeeb...@zohocorp.com> wrote: > Hi Ilia Evdokimov, > > While it's true that no arithmetic or logical operations are performed on > queryid after the assignment, the overflow technically occurs at the > point of assignment itself. For example, *entry->key.queryid* holds the > value *12747288675711951805* as a *uint64*, but after assigning it to > *queryid* (which is an *int64*), it becomes *-5699455397997599811* *due > to overflow*. > > This conversion is intentional - most likely to match the *bigint* type > of the *queryid* column in *pg_stat_statements*. However, without an > explicit comment, this can be misleading. A beginner reading this might > misinterpret it as an unintentional overflow or bug and raise unnecessary > concerns. Therefore, it’s worth adding a brief comment clarifying the > intent behind this assignment. > > > > Thanks & Regards, > Shaik Mohammad Mujeeb > Member Technical Staff > Zoho Corp > > > > > > On Fri, 16 May 2025 15:12:41 +0530 *Ilia Evdokimov > >* wrote --- > > > On 15.05.2025 10:08, Shaik Mohammad Mujeeb wrote: > > > I don't think the comment is necessary here. There are no arithmetic or > logical operations performed on it. It is only passed as a Datum. > > -- > Best regards, > Ilia Evdokimov, > Tantor Labs LLC. > > Hi Developers, > > In pg_stat_statements.c, the function *pg_stat_statements_internal()* > declares the *queryid* variable as *int64*, but assigns it a value of > type *uint64*. At first glance, this might appear to be an overflow > issue. However, I think this is intentional - the queryid is cast to > *int64* *to match the bigint type of the queryid column in the > pg_stat_statements view*. > > Please find the attached patch, which adds a clarifying comment to make > the rationale explicit and avoid potential confusion for future readers. > > > > Thanks and Regards, > Shaik Mohammad Mujeeb > Member Technical Staff > Zoho Corp > > > > >
Re: Should we optimize the `ORDER BY random() LIMIT x` case?
Hi, > A custom SRF seems great to me. You may propose such an aggregate in the > core - it seems it doesn't even need any syntax changes. For example: > SELECT * FROM (SELECT sample(q, 10, ) FROM (SELECT ...) AS q); > or something like that. I experimented with this idea a bit and it looks like this is not going to work. In our case sample() should behave both as an aggregate function and as SRF. An aggregate function computes a single result [1]. We probably could bypass this by returning an array but it would be inconvenient for the user. The problem with SRFs is that we never know when SRF is called the last time e.g. there is no SRF_IS_LASTCALL(). So there is no way to aggregate the tuples until we receive the last one and then return the sample. I don't think SRF can know this in a general case (JOINs, SELECT .. LIMIT .., etc). Unless I'm missing something of course. Perhaps somebody smarter than me could show us the exact way to implement sample() as an SRF, but at the moment personally I don't see it. On top of that it's worth noting that the query you showed above wouldn't be such a great UI in my opinion. If I'm right about the limitations of aggregate functions and SRFs this leaves us the following options: 1. Changing the constraints of aggregate functions or SRFs. However I don't think we want to do it for such a single niche scenario. 2. Custom syntax and a custom node. 3. To give up Thoughts? [1]: https://www.postgresql.org/docs/current/functions-aggregate.html -- Best regards, Aleksander Alekseev
Re: Assertion failure in smgr.c when using pg_prewarm with partitioned tables
On 2025/05/16 15:33, Dilip Kumar wrote: On Fri, May 16, 2025 at 11:51 AM Masahiro Ikeda wrote: Thank you for your comments! I updated the patch to use RELKIND_HAS_STORAGE() as done in commit 4623d7144. Please see the v2-0001 patch for the changes. Thanks for updating the patch! You added a test for pg_prewarm() with a partitioned table in the TAP tests. But, wouldn't it be simpler and easier to maintain if we added this as a regular regression test (i.e., using .sql and .out files) instead of TAP tests? That should be sufficient for this case? Also, since the issue was introduced in v17, this patch should be back-patched to v17, right? On 2025-05-15 19:57, Richard Guo wrote: +1. FWIW, not long ago we fixed a similar Assert failure in contrib/pg_freespacemap by verifying RELKIND_HAS_STORAGE() before trying to access the storage (see 4623d7144). Wondering if there are other similar issues elsewhere in contrib ... I tested all contrib functions that take regclass arguments (see attached test.sql and test_result.txt). The result shows that only pg_prewarm() can lead to the assertion failure. Right, even while I was searching, I see this is used in 3 contrib modules, amcheck, autoprewarm, and pg_prewarm. amcheck is already checking for AM type, and Autoprewarm is identifying the relation by block, so there is no chance of getting the relation which do not have storage. However, I found that amcheck's error messages can be misleading when run on partitioned indexes. For example, on the master branch, amcheck (e.g., bt_index_check function) shows this error: > ERROR: expected "btree" index as targets for verification > DETAIL: Relation "pgbench_accounts_pkey" is a btree index. This message says it expects a "btree" index, yet also states the relation is a "btree" index, which can seem contradictory. The actual issue is that the index is a btree partitioned index, but this detail is missing, causing possible confusion. Yes, I found the error message confusing during testing as well, so it makes sense to improve it. +1 Regarding the patch, how about also adding a regression test for amcheck with a partitioned index? Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
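For context, the guard pattern from the pg_freespacemap fix referenced above looks roughly like the following. This is a sketch assuming rel is an already-opened Relation, not a hunk copied from the v2 patch:

    /* Reject relations without storage, e.g. partitioned tables and indexes */
    if (!RELKIND_HAS_STORAGE(rel->rd_rel->relkind))
        ereport(ERROR,
                (errcode(ERRCODE_WRONG_OBJECT_TYPE),
                 errmsg("relation \"%s\" does not have storage",
                        RelationGetRelationName(rel)),
                 errdetail_relkind_not_supported(rel->rd_rel->relkind)));

Placing such a check before any smgr access both avoids the assertion failure and yields a clearer error than the confusing amcheck message discussed above.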
Re: Conflict detection for update_deleted in logical replication
On Thu, May 15, 2025 at 6:02 PM Amit Kapila wrote: > > Few minor comments on 0001: > 1. > + if (TimestampDifferenceExceeds(data->reply_time, > +data->candidate_xid_time, 0)) > + ereport(ERROR, > + errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), > + errmsg("oldest_nonremovable_xid transaction ID may be advanced > prematurely"), > > Shouldn't this be elog as this is an internal message? And, instead of > "... may be ..", shall we use ".. could be .." in the error message as > the oldest_nonremovable_xid is not yet advanced by this time. > > 2. > + * It's necessary to use FullTransactionId here to mitigate potential race > + * conditions. Such scenarios might occur if the replication slot is not > + * yet created by the launcher while the apply worker has already > + * initialized this field. > > IIRC, we discussed why it isn't easy to close this race condition. Can > we capture that in comments separately? > A few more comments: = 3. maybe_advance_nonremovable_xid(RetainConflictInfoData *data, bool status_received) { /* Exit early if retaining conflict information is not required */ if (!MySubscription->retainconflictinfo) return; /* * It is sufficient to manage non-removable transaction ID for a * subscription by the main apply worker to detect update_deleted conflict * even for table sync or parallel apply workers. */ if (!am_leader_apply_worker()) return; /* Exit early if we have already stopped retaining */ if (MyLogicalRepWorker->stop_conflict_info_retention) return; ... get_candidate_xid() { ... if (!TimestampDifferenceExceeds(data->candidate_xid_time, now, data->xid_advance_interval)) return; Would it be better to encapsulate all these preliminary checks that decide whether we can move to computing oldest_nonremovable_xid in a separate function? The check in get_candidate_xid would require some additional conditions because it is not required in every phase. Additionally, we can move the core phase processing logic to compute in a separate function. We can try this to see if the code looks better with such a refactoring. 4. + /* + * Check if all remote concurrent transactions that were active at the + * first status request have now completed. If completed, proceed to the + * next phase; otherwise, continue checking the publisher status until + * these transactions finish. + */ + if (FullTransactionIdPrecedesOrEquals(data->last_phase_at, + remote_full_xid)) + data->phase = RCI_WAIT_FOR_LOCAL_FLUSH; I think there is a possibility of optimization here for cases where there are no new transactions on the publisher side across the next cycle of advancement of oldest_nonremovable_xid. We can simply set candidate_xid as oldest_nonremovable_xid instead of going into RCI_WAIT_FOR_LOCAL_FLUSH phase. If you want to keep the code simple for the first version, then at least note that down in comments, but OTOH, if it is simple to achieve, then let's do it now. -- With Regards, Amit Kapila.
Make wal_receiver_timeout configurable per subscription
Hi, When multiple subscribers connect to different publisher servers, it can be useful to set different wal_receiver_timeout values for each connection to better detect failures. However, this isn't currently possible, which limits flexibility in managing subscriptions. To address this, I'd like to propose making wal_receiver_timeout configurable per subscription. One approach is to add wal_receiver_timeout as a parameter to CREATE SUBSCRIPTION command, storing it in pg_subscription so each logical replication worker can use its specific value. Another option is to change the wal_receiver_timeout's GUC context from PGC_SIGHUP to PGC_USERSET. This would allow setting different values via ALTER ROLE SET command for each subscription owner - effectively enabling per-subscription configuration. Since this approach is simpler and likely sufficient, I'd prefer starting with this. Thought? BTW, this could be extended in the future to other GUCs used by logical replication workers, such as wal_retrieve_retry_interval. Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
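To make the second, simpler option concrete, it essentially amounts to flipping the context field of the wal_receiver_timeout entry in guc_tables.c, roughly as below. The surrounding fields are paraphrased from memory rather than copied from the tree, so treat them as illustrative:

    /* in ConfigureNamesInt[] of src/backend/utils/misc/guc_tables.c */
    {
        {"wal_receiver_timeout", PGC_USERSET, REPLICATION_STANDBY,  /* was PGC_SIGHUP */
            gettext_noop("Sets the maximum wait time to receive data from the sending server."),
            NULL,
            GUC_UNIT_MS
        },
        &wal_receiver_timeout,
        60 * 1000, 0, INT_MAX,
        NULL, NULL, NULL
    },

With PGC_USERSET, a value can then be attached to each subscription owner via ALTER ROLE ... SET wal_receiver_timeout, which is what gives the per-subscription behavior described above.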
Re: Align wording on copyright organization
> On 16 May 2025, at 10:40, Daniel Gustafsson wrote: > I will go ahead and push this to all supported branches. Done. I also ensured that the text in the legal notice match the OSI license text while in there. -- Daniel Gustafsson
Proposal: Make cfbot fail on patches not created by "git format-patch"
In the "Scaling PostgreSQL Development" unconference session. One of the problems that came up was that people don't follow "best practices". The response to that was that people don't know what the best practices are (nor that they are important to follow), because we don't enforce them. Based on the discussion there I'm planning to make the cfbot fail to apply a patch in the following two cases: 1. If a patch is not created by "git format-patch" (but cfbot will still use "patch" to apply the patch in case "git am" fails) 2. If the commit message has no body (so only a title) Does anyone have strong opposition to this? To be clear, it means we don't run CI on patches created by piping "git diff" to a file anymore, as a way to nudge submitters into providing useful commit messages. Communication wise, I plan to show this in the CF app as "Fails apply" (instead of "Needs Rebase"). When clicking the "Fails apply" link, it would then show a log as to why the apply failed. need to be created using git format patch, and should have a descriptive commit message.
Re: PG 17.2 compilation fails with -std=c11 on mac
Lakshmi Narayana Velayudam writes: > When I try to compile postgres 17.2 with -std=c11 I am getting the > below error on mac > explicit_bzero.c:22:9: error: call to undeclared function 'memset_s'; ISO > C99 and later do not support implicit function declarations > [-Wimplicit-function-declaration] Yeah, I can reproduce that. After some digging, I see that the problem is explained by "man memset_s": SYNOPSIS #define __STDC_WANT_LIB_EXT1__ 1 #include <string.h> errno_t memset_s(void *s, rsize_t smax, int c, rsize_t n); We lack that #define, which results in not supplying the declaration. AFAICT the requirement for this is in the C11 standard, this is not just Apple doing something weird. (I did not figure out how come the code compiles without -std=c11.) Aside from this compilation error, we're probably failing to use memset_s on some platforms where it's available, for lack of the #define in our configure/meson probes. I know how to fix that in configure, but I'm less clear on how nonstandard probes work in meson --- any help there? regards, tom lane
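To illustrate the C11 Annex K requirement described above, here is a tiny standalone program (not PostgreSQL code): the declaration of memset_s() only becomes visible when the feature-test macro is defined before including <string.h>, and only on libcs that provide the optional Annex K interfaces at all.

/* must come before the include, per C11 Annex K */
#define __STDC_WANT_LIB_EXT1__ 1
#include <string.h>

int
main(void)
{
    char    secret[16] = "hunter2";

    /* without the #define above, this is an implicit declaration under -std=c11 */
    (void) memset_s(secret, sizeof(secret), 0, sizeof(secret));
    return 0;
}

This mirrors what explicit_bzero.c does, and is also why the configure/meson probes need the same #define in order to detect memset_s correctly.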
Re: Align wording on copyright organization
On Fri, 16 May 2025 at 11:50, Daniel Gustafsson wrote: > > On 16 May 2025, at 10:40, Daniel Gustafsson wrote: > > > I will go ahead and push this to all supported branches. > > Done. I also ensured that the text in the legal notice match the OSI > license > text while in there. > Great, thank you very much. -- Dave Page pgAdmin: https://www.pgadmin.org PostgreSQL: https://www.postgresql.org pgEdge: https://www.pgedge.com
Re: Proposal: Make cfbot fail on patches not created by "git format-patch"
> On 16 May 2025, at 11:52, Jelte Fennema-Nio wrote: > Does anyone have strong opposition to this? To be clear, it means we don't > run CI on patches created by piping "git diff" to a file anymore, as a way > to nudge submitters into providing useful commit messages. Disclaimer: I wasn't in the session (due to conflicting interesting sessions) so I might be raising points/questions already answered. Is this really lowering the bar for new contributors? I've always held "be liberal in what you accept" as a gold standard for projects I'm involved in, to remove barriers to entry. Good commit messages are obviously very important, but having your patch rejected (yes, I know, failing to apply) might not be strongest motivator for achieving this. -- Daniel Gustafsson
Re: C extension compilation failed while using PG 17.2 on mac m1
Lakshmi Narayana Velayudam writes: > Hello, while compiling my C extension with PG 17.2 I am getting > postgresql-17.2/pgsql/include/server/port/pg_iovec.h:93:10: error: call to > undeclared function 'pwritev'; ISO C99 and later do not support implicit > function declarations [-Wimplicit-function-declaration] I failed to reproduce this. I wonder if you have some kind of version skew problem --- SDK not matching underlying system, for instance. preadv and pwritev do have some weirdness on Apple platforms, but we should handle that on any PG >= v15 (cf. commit f014b1b9b). regards, tom lane
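If it helps to rule out SDK/toolchain skew, here is a tiny hypothetical test program (not from the thread) that should compile cleanly under the same -std=c11 flags whenever the SDK actually declares pwritev:

    #include <sys/uio.h>    /* where macOS and most platforms declare preadv/pwritev */

    int
    main(void)
    {
        struct iovec iov = {0};

        /* if the SDK lacks a pwritev declaration, this triggers the same
         * implicit-function-declaration error reported above */
        return (int) pwritev(-1, &iov, 1, 0);
    }

If this small file fails the same way while a PostgreSQL build itself succeeds, that would point toward the extension being compiled against a different SDK than the server was.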
Re: wrong query results on bf leafhopper
There are two different scans in those diffs ("Seq Scan on tenk1 t2" and "Seq Scan on tenk1 t1"), so might it not be a bug? On 16.05.2025 09:19, Andres Freund wrote: Hi, I noticed this recent BF failure: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=leafhopper&dt=2025-05-15%2008%3A10%3A04 === dumping /home/bf/proj/bf/build-farm-17/HEAD/pgsql.build/src/test/recovery/tmp_check/regression.diffs === diff -U3 /home/bf/proj/bf/build-farm-17/HEAD/pgsql.build/src/test/regress/expected/memoize.out /home/bf/proj/bf/build-farm-17/HEAD/pgsql.build/src/test/recovery/tmp_check/results/memoize.out --- /home/bf/proj/bf/build-farm-17/HEAD/pgsql.build/src/test/regress/expected/memoize.out 2025-05-15 08:10:04.211926695 + +++ /home/bf/proj/bf/build-farm-17/HEAD/pgsql.build/src/test/recovery/tmp_check/results/memoize.out 2025-05-15 08:18:29.117733601 + @@ -42,7 +42,7 @@ -> Nested Loop (actual rows=1000.00 loops=N) -> Seq Scan on tenk1 t2 (actual rows=1000.00 loops=N) Filter: (unique1 < 1000) - Rows Removed by Filter: 9000 + Rows Removed by Filter: 8982 -> Memoize (actual rows=1.00 loops=N) Cache Key: t2.twenty Cache Mode: logical @@ -178,7 +178,7 @@ -> Nested Loop (actual rows=1000.00 loops=N) -> Seq Scan on tenk1 t1 (actual rows=1000.00 loops=N) Filter: (unique1 < 1000) - Rows Removed by Filter: 9000 + Rows Removed by Filter: 8981 -> Memoize (actual rows=1.00 loops=N) Cache Key: t1.two, t1.twenty Cache Mode: binary For a moment I thought this could be a bug in memoize, but that doesn't actually make sense - the failure isn't in memoize, it's the seqscan. Subsequently I got worried that this is an AIO bug or such causing wrong query results. But there are instances of this error well before AIO was merged. E.g. https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=leafhopper&dt=2024-12-18%2023%3A35%3A04 The same error is also present down to 16. In 15, I saw a potentially related error https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=leafhopper&dt=2024-12-16%2023%3A43%3A03 There have been other odd things on leafhopper, see e.g.: https://www.postgresql.org/message-id/35d87371-f3ab-42c8-9aac-bb39ab5bd987%40gmail.com https://postgr.es/m/Z4npAKvchWzKfb_r%40paquier.xyz Greetings, Andres Freund -- Regards, Alena Rybakina Postgres Professional
Re: Proposal: Make cfbot fail on patches not created by "git format-patch"
Jelte Fennema-Nio writes: > Based on the discussion there I'm planning to make the cfbot fail to apply > a patch in the following two cases: > ... > To be clear, it means we don't > run CI on patches created by piping "git diff" to a file anymore, as a way > to nudge submitters into providing useful commit messages. That outcome seems entirely horrible to me. If you want to flag the lack of a commit message somehow, fine, but don't prevent CI from running. regards, tom lane
Re: Conflict detection for update_deleted in logical replication
On Fri, May 16, 2025 at 12:17 PM Amit Kapila wrote: > > On Fri, Apr 25, 2025 at 10:08 AM shveta malik wrote: > > > > On Thu, Apr 24, 2025 at 6:11 PM Zhijie Hou (Fujitsu) > > wrote: > > > > > > Few comments for patch004: > > > > Config.sgml: > > > > 1) > > > > + > > > > +Maximum duration (in milliseconds) for which conflict > > > > +information can be retained for conflict detection by the > > > > apply worker. > > > > +The default value is 0, indicating that > > > > conflict > > > > +information is retained until it is no longer needed for > > > > detection > > > > +purposes. > > > > + > > > > > > > > IIUC, the above is not entirely accurate. Suppose the subscriber > > > > manages to > > > > catch up and sets oldest_nonremovable_xid to 100, which is then updated > > > > in > > > > slot. After this, the apply worker takes a nap and begins a new xid > > > > update cycle. > > > > Now, let’s say the next candidate_xid is 200, but this time the > > > > subscriber fails > > > > to keep up and exceeds max_conflict_retention_duration. As a result, it > > > > sets > > > > oldest_nonremovable_xid to InvalidTransactionId, and the launcher skips > > > > updating the slot’s xmin. > > > > > > If the time exceeds the max_conflict_retention_duration, the launcher > > > would > > > invalidate the slot, instead of skipping updating it. So the conflict > > > info (e.g., > > > dead tuples) would not be retained anymore. > > > > > > > The launcher will not invalidate the slot until all subscriptions have > > stopped conflict_info retention. So info about dead tuples for a > > particular oldest_xmin of a particular apply worker could be retained > > for much longer than this configured duration. If other apply workers > > are actively working (catching up with the primary), then they should keep > > on advancing the xmin of the shared slot, but if the xmin of the shared slot remains the > > same for, say, 15min+15min+15min for 3 apply workers (assuming they are > > marking themselves with stop_conflict_retention one after the other and the > > xmin of the slot has not been advanced), then the first apply worker > > having marked itself with stop_conflict_retention still has access to > > the oldest_xmin's data for 45 mins instead of 15 mins (where > > max_conflict_retention_duration=15 mins). Please let me know if my > > understanding is wrong. > > > > IIUC, the current code will stop updating the slot even if one of the > apply workers has set stop_conflict_info_retention. The other apply > workers will keep on maintaining their oldest_nonremovable_xid without > advancing the slot. If this is correct, then what behavior do we > expect here instead? I think this is not the current behaviour. > Do we want the slot to keep advancing till any worker is > actively maintaining oldest_nonremovable_xid? In fact, this one is the current behaviour of the v30 patch. > To some extent, this > matches with the cases where the user has set retain_conflict_info for > some subscriptions but not for others. > > If so, how will users eventually know for which tables they can expect > to reliably detect update_delete? One possibility is that users can > check which apply workers have stopped maintaining > oldest_nonremovable_xid via the pg_stat_subscription view and then see the > tables corresponding to those subscriptions. Yes, it is a possibility, but I feel it will be too much to monitor from the user's perspective. > Also, what will we do as > part of the resolutions in the apply workers where > stop_conflict_info_retention is set? > Shall we simply LOG that we can't > resolve this conflict and continue till the user takes some action, or > simply error out in such cases? We can LOG. Erroring out again will prevent the subscriber from proceeding, and the subscriber initially reached this state due to falling behind, which led to stop_conflict_retention=true. But still, if we go with erroring out, I am not sure what action users can take in this situation. The subscriber is still lagging, and if the user recreates the slot as a solution, the apply worker will soon go back to the 'stop_conflict_retention=true' state, provided the subscriber is still not able to catch up. thanks Shveta
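To make the behaviour under discussion easier to follow (the shared slot's xmin keeps advancing as long as at least one apply worker is still retaining conflict info), here is a rough illustrative sketch; the struct and field names only mirror the thread's terminology and are not the actual patch code:

    #include "postgres.h"
    #include "access/transam.h"

    /* Illustrative type only; the real patch tracks this per apply worker. */
    typedef struct WorkerRetentionState
    {
        bool        stop_conflict_retention;    /* worker gave up retaining conflict info */
        TransactionId oldest_nonremovable_xid;  /* oldest xid this worker still needs */
    } WorkerRetentionState;

    /*
     * Compute the xmin the launcher would use for the shared slot, ignoring
     * workers that have stopped conflict-info retention.  Under this scheme
     * the slot can keep advancing even while some workers have given up,
     * which is the behaviour debated above.
     */
    static TransactionId
    compute_shared_slot_xmin(const WorkerRetentionState *workers, int nworkers)
    {
        TransactionId xmin = InvalidTransactionId;

        for (int i = 0; i < nworkers; i++)
        {
            if (workers[i].stop_conflict_retention ||
                !TransactionIdIsValid(workers[i].oldest_nonremovable_xid))
                continue;

            if (!TransactionIdIsValid(xmin) ||
                TransactionIdPrecedes(workers[i].oldest_nonremovable_xid, xmin))
                xmin = workers[i].oldest_nonremovable_xid;
        }

        return xmin;
    }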
Re: Conflict detection for update_deleted in logical replication
On Fri, May 16, 2025 at 2:40 PM Amit Kapila wrote: > > On Fri, May 16, 2025 at 12:01 PM shveta malik wrote: > > > > On Fri, May 16, 2025 at 11:15 AM Amit Kapila > > wrote: > > > > > > > > > BTW, another related point is that when we decide to stop retaining > > > dead tuples (via should_stop_conflict_info_retention), should we also > > > consider the case that the apply worker didn't even try to get the > > > publisher status because previously it decided that > > > oldest_nonremovable_xid cannot be advanced due to its > > > OldestActiveTransactionId? > > > > > > > Do you mean avoid stop-conflict-retention in such a case, since the apply > > worker itself did not request status from the publisher? If I > > understood your point correctly, then we can do that by advancing the > > timer to a new value even if we did not update candidate_xid and did > > not ask for the status from the publisher. > > > > But candidate_xid_time is also used in wait_for_local_flush() to check the > clock_skew between publisher and subscriber, so for that purpose, it > is better to set it along with candidate_xid. However, can't we rely > on a valid value of candidate_xid to ensure that the apply worker didn't > send any request? Note that we always reset candidate_xid once we have > updated oldest_nonremovable_xid. > I think this is automatically taken care of, because we call should_stop_conflict_info_retention() only during the 'wait' phase, which happens after candidate_xid is set. Having said that, we should have an assert for candidate_xid in should_stop_conflict_info_retention() and also note in the comments that it should be called only during the 'wait' phase. Additionally, we could add an assert that should_stop_conflict_info_retention() is indeed called only during the 'wait' phase. -- With Regards, Amit Kapila.
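For concreteness, a hypothetical sketch of the suggested assert; the struct, field, and GUC names follow the thread's terminology and may not match the actual patch:

    #include "postgres.h"
    #include "access/transam.h"
    #include "utils/timestamp.h"

    /* Hypothetical declarations mirroring the names used in this thread */
    typedef struct RetainConflictInfoData
    {
        TransactionId candidate_xid;
        TimestampTz candidate_xid_time;
    } RetainConflictInfoData;

    static int  max_conflict_retention_duration = 0;   /* assumed GUC, in ms; 0 = no limit */

    /*
     * Should be called only during the 'wait' phase, i.e. after a
     * candidate_xid has been chosen, so assert that up front.
     */
    static bool
    should_stop_conflict_info_retention(RetainConflictInfoData *rci)
    {
        Assert(TransactionIdIsValid(rci->candidate_xid));

        /* stop retaining conflict info once the configured duration is exceeded */
        return max_conflict_retention_duration > 0 &&
            TimestampDifferenceExceeds(rci->candidate_xid_time,
                                       GetCurrentTimestamp(),
                                       max_conflict_retention_duration);
    }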
Re: Align wording on copyright organization
> On 16 May 2025, at 10:00, Tom Lane wrote: > > Dave Page writes: >> On Fri, 16 May 2025 at 09:12, Daniel Gustafsson wrote: >>> As was briefly discussed in the developers meeting, and the hallway track, >>> at >>> PGConf.dev we have a few variations on the organization wording which >>> really >>> should be aligned with the wording defined by -core in pgweb commit >>> 2d764dbc08. > >> Thanks Daniel - the patch looks good to me. >> You'll push this to all active branches, correct? > > +1 --- thanks Daniel, I had this on my to-do list also. Thanks for the review, I will go ahead and push this to all supported branches. -- Daniel Gustafsson