Query execution in Perl TAP tests needs work

2023-08-27 Thread Thomas Munro
Hi,

Every time we run a SQL query, we fork a new psql process and a new
cold backend process.  It's not free on Unix, and quite a lot worse on
Windows, at around 70ms per query.  Take amcheck/001_verify_heapam for
example.  It runs 272 subtests firing off a stream of queries, and
completes in ~51s on Windows (!), and ~6-9s on the various Unixen, on
CI.

Here are some timestamps I captured from CI by instrumenting various
Perl and C bits:

0.000s: IPC::Run starts
0.023s:   postmaster socket sees connection
0.025s:   postmaster has created child process
0.033s: backend starts running main()
0.039s: backend has reattached to shared memory
0.043s: backend connection authorized message
0.046s: backend has executed and logged query
0.070s: IPC::Run returns

I expected process creation to be slow on that OS, but it seems like
something happening at the end is even slower.  CI shows Windows
consuming 4 CPUs at 100% for a full 10 minutes to run a test suite
that finishes in 2-3 minutes everywhere else with the same number of
CPUs.  Could there be an event handling snafu in IPC::Run or elsewhere
nearby?  It seems like there must be either a busy loop or a busted
sleep/wakeup... somewhere?  But even if there's a weird bug here
waiting to be discovered and fixed, I guess it'll always be too slow
at ~10ms per process spawned, with two processes to spawn, and it's
bad enough on Unix.

As an experiment, I hacked up a not-good-enough-to-share patch where
$node->safe_psql() would automatically cache a BackgroundPsql object
and reuse it, and the times for that test dropped ~51s -> ~9s on
Windows, and ~7s -> ~2s on the Unixen.  But even that seems non-ideal
(well, it's certainly non-ideal the way I hacked it up anyway...).  I
suppose there are quite a few ways we could do better (a rough sketch
of the caching idea follows the list below):

1.  Don't fork anything at all: open (and cache) a connection directly
from Perl.
1a.  Write xsub or ffi bindings for libpq.  Or vendor (parts) of the
popular Perl xsub library?
1b.  Write our own mini pure-perl pq client module.  Or vendor (parts)
of some existing one.
2.  Use long-lived psql sessions.
2a.  Something building on BackgroundPsql.
2b.  Maybe give psql or a new libpq-wrapper a new low level stdio/pipe
protocol that is more fun to talk to from Perl/machines?
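
To make 2a concrete, here is a rough sketch of the caching idea
(hypothetical helper name; it assumes the existing
PostgreSQL::Test::Cluster::background_psql() and
BackgroundPsql::query_safe() test APIs):

# Hypothetical sketch only: keep one BackgroundPsql per node/database and
# reuse it, instead of forking psql plus a cold backend for every query.
my %session_cache;

sub cached_safe_psql
{
    my ($node, $dbname, $sql) = @_;
    my $key = $node->name . '/' . $dbname;

    # Start (and remember) an interactive psql session on first use.
    $session_cache{$key} //= $node->background_psql($dbname);

    # query_safe() dies on any ERROR output, like safe_psql() does.
    return $session_cache{$key}->query_safe($sql);
}

Sessions would still have to be invalidated on node stop/restart and quit
at teardown, which is roughly where the non-ideal parts begin.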

In some other languages one can do FFI pretty easily so we could use
the in-tree libpq without extra dependencies:

>>> import ctypes
>>> libpq = ctypes.cdll.LoadLibrary("/path/to/libpq.so")
>>> libpq.PQlibVersion()
17

... but it seems you can't do either static C bindings or runtime FFI
from Perl without adding a new library/package dependency.  I'm not
much of a Perl hacker so I don't have any particular feeling.  What
would be best?
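
For comparison with the ctypes example above, here is a rough sketch of
what 1a could look like if we accepted a new package dependency such as
FFI::Platypus (from CPAN, not core Perl), binding a few libpq calls at
runtime:

use strict;
use warnings;
use FFI::Platypus 2.00;

# Hypothetical sketch: runtime FFI against libpq, which would add the new
# dependency mentioned above.
my $ffi = FFI::Platypus->new(api => 2);
$ffi->lib('/path/to/libpq.so');

$ffi->attach(PQlibVersion => []         => 'int');
$ffi->attach(PQconnectdb  => ['string'] => 'opaque');
$ffi->attach(PQstatus     => ['opaque'] => 'int');
$ffi->attach(PQfinish     => ['opaque'] => 'void');

print PQlibVersion(), "\n";

my $conn = PQconnectdb("dbname=postgres");
print "status: ", PQstatus($conn), "\n";   # 0 == CONNECTION_OK
PQfinish($conn);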

This message brought to you by the Lorax.




Re: DecodeInterval fixes

2023-08-27 Thread Michael Paquier
On Sun, Aug 27, 2023 at 04:14:00PM -0400, Joseph Koshakow wrote:
> On Tue, Aug 22, 2023 at 12:58 PM Jacob Champion 
> wrote:
>> I wouldn't argue for backpatching, for sure, but I guess I saw this as
>> falling into the same vein as 5b3c5953 and bcc704b52 which were
>> already committed.
> 
> I agree, I don't think we should try and backport this. As Jacob
> highlighted, we've merged similar patches for other date time types.
> If applications were relying on this behavior, the upgrade may be a
> good time for them to re-evaluate their usage since it's outside the
> documented spec and they may not be getting the units they're expecting
> from intervals like '1 day month'.

I felt like asking anyway.  I have looked at the patch series and the
past compatibility changes in this area, and I kind of agree that this
seems like an improvement against confusing interval values.  So, I
have applied 0001, 0002 and 0003 after more review.

0002 was a bit careless with the code indentation.

In 0003, I was wondering a bit if we'd better set parsing_unit_val to
false for AGO, but as we do a backward lookup and because after 0002
AGO can only be last, I've left the code as you suggested, relying
on the initial value of this variable.
--
Michael




RE: [PoC] pg_upgrade: allow to upgrade publisher node

2023-08-27 Thread Hayato Kuroda (Fujitsu)
Dear Dilip,

Thank you for reading the thread!

> I haven't read this thread in detail, but I have one high-level design
> question.  The upgrade of the replication slot is default or is it
> under some GUC?

I designed it so that logical slots are upgraded by default.

> because if it is by default then some of the users
> might experience failure in some cases e.g. a) wal_level in the new
> cluster is not logical b) If this new check
> check_old_cluster_for_confirmed_flush_lsn() fails due to confirm flush
> LSN is not at the latest shutdown checkpoint.  I am not sure whether this
> is a problem or could be just handled by documenting this behavior.

I think it should be done by default to avoid a WAL hole. If we do not
upgrade slots by default, users may forget to specify the option, notice
only later that the slots were not migrated, and create them at that point,
which leads to data loss on the subscriber. The inconsistency between nodes
is really bad. Developers also requested enabling this by default [1].

Moreover, the checks related to logical slots are skipped when no slots are
defined on the old cluster, so this does not affect users who do not use
logical slots.

Also, we are considering introducing an option for excluding slots once
this has been committed [2].

[1]: 
https://www.postgresql.org/message-id/ad83b9f2-ced3-c51c-342a-cc281ff562fc%40postgresql.org
[2]: 
https://www.postgresql.org/message-id/CAA4eK1KxP%2BgogYOsTHbZVPO7Pp38gcRjEWUxv%2B4X3dFept3z3A%40mail.gmail.com

Best Regards,
Hayato Kuroda
FUJITSU LIMITED


Re: [PoC] pg_upgrade: allow to upgrade publisher node

2023-08-27 Thread Dilip Kumar
On Sat, Aug 26, 2023 at 9:54 AM Hayato Kuroda (Fujitsu)
 wrote:
>
> Dear Peter,
>
> Thank you for reviewing! PSA new version patch set.

I haven't read this thread in detail, but I have one high-level design
question.  Is the upgrade of the replication slot the default, or is it
under some GUC?  Because if it is by default then some users
might experience failures in some cases, e.g. a) wal_level in the new
cluster is not logical, b) the new check
check_old_cluster_for_confirmed_flush_lsn() fails because the confirmed
flush LSN is not at the latest shutdown checkpoint.  I am not sure whether
this is a problem or could just be handled by documenting this behavior.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




Re: pg_stat_get_backend_subxact() and backend IDs?

2023-08-27 Thread Michael Paquier
On Fri, Aug 25, 2023 at 10:56:14PM +, Imseih (AWS), Sami wrote:
> > Here is a new version of the patch that avoids changing the names of the
> > existing functions. I'm not thrilled about the name
> > (pgstat_fetch_stat_local_beentry_by_backend_id), so I am open to
> > suggestions. In any case, I'd like to rename all three of the>
> > pgstat_fetch_stat_* functions in v17.
> 
> Thanks for the updated patch.
> 
> I reviewed/tested the latest version and I don't have any more comments.

FWIW, I find the new routine introduced by this patch rather
confusing.  pgstat_fetch_stat_local_beentry() and
pgstat_fetch_stat_local_beentry_by_backend_id() use the same 
argument name for a BackendId or an int.  This is not entirely the
fault of this patch as pg_stat_get_backend_subxact() itself is
confused about "beid" being a uint32 or a BackendId.  However, I think
that this makes it much harder to figure out that
pgstat_fetch_stat_local_beentry() is only here because, for all its
callers, it is cheaper to do a sequential scan of all the local beentries
rather than a bsearch(), while
pgstat_fetch_stat_local_beentry_by_backend_id() is here because we
want to retrieve the local beentry matching the *backend ID* with a
binary search.

I understand that this is not fantastic naming, but renaming
pgstat_fetch_stat_local_beentry() to something like
pgstat_fetch_stat_local_beentry_by_{index|position}_id() would make
the difference much easier to grasp, and we should avoid the use of
"beid" when we refer to the *position/index ID* in
localBackendStatusTable, because it is not a BackendId at all, just a
position in the local array.
--
Michael




Logger process and "LOG: could not close client or listen socket: Bad file descriptor"

2023-08-27 Thread Michael Paquier
Hi all,
(Heikki in CC.)

After b0bea38705b2, I have noticed that the syslogger is generating a
lot of dummy LOG entries:
2023-08-28 09:40:52.565 JST [24554]
LOG:  could not close client or listen socket: Bad file descriptor  

The only reason why I have noticed this issue is because I enable the
logging collector in my development scripts.  Note that the pg_ctl
test 004_logrotate.pl, the only one with logging_collector set, is
equally able to reproduce the issue.

The root of the problem is ClosePostmasterPorts() in syslogger.c,
where we close the postmaster ports, but all of them are still set to
0, leading to these spurious logs.

From what I can see, this is a rather old issue, because
ListenSocket[] is filled with PGINVALID_SOCKET *after* starting the
syslogger.  It seems to me that we should just initialize the array
before starting the syslogger, so that we don't get these incorrect
logs?

Thoughts?  Please see the attached.
--
Michael
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index 41bccb46a8..acd46718fc 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -1141,6 +1141,17 @@ PostmasterMain(int argc, char *argv[])
  errmsg("could not remove file \"%s\": %m",
 		LOG_METAINFO_DATAFILE)));
 
+	/*
+	 * Establish input sockets.
+	 *
+	 * First, mark them all closed, and set up an on_proc_exit function that's
+	 * charged with closing the sockets again at postmaster shutdown.
+	 */
+	for (i = 0; i < MAXLISTEN; i++)
+		ListenSocket[i] = PGINVALID_SOCKET;
+
+	on_proc_exit(CloseServerPorts, 0);
+
 	/*
 	 * If enabled, start up syslogger collection subprocess
 	 */
@@ -1173,17 +1184,6 @@ PostmasterMain(int argc, char *argv[])
 	ereport(LOG,
 			(errmsg("starting %s", PG_VERSION_STR)));
 
-	/*
-	 * Establish input sockets.
-	 *
-	 * First, mark them all closed, and set up an on_proc_exit function that's
-	 * charged with closing the sockets again at postmaster shutdown.
-	 */
-	for (i = 0; i < MAXLISTEN; i++)
-		ListenSocket[i] = PGINVALID_SOCKET;
-
-	on_proc_exit(CloseServerPorts, 0);
-
 	if (ListenAddresses)
 	{
 		char	   *rawstring;




Re: subscription/015_stream sometimes breaks

2023-08-27 Thread Peter Smith
On Fri, Aug 25, 2023 at 8:15 PM Amit Kapila  wrote:
>
> On Thu, Aug 24, 2023 at 3:48 PM Amit Kapila  wrote:
> >
> > On Thu, Aug 24, 2023 at 1:20 PM Alvaro Herrera  
> > wrote:
> > >
> > > On 2023-Aug-24, Amit Kapila wrote:
> > >
> > > > On Wed, Aug 23, 2023 at 1:31 PM Alvaro Herrera 
> > > >  wrote:
> > >
> > > > > Hmm, I think if worker->in_use is false, we shouldn't consult the rest
> > > > > of the struct at all, so I propose to add the attached 0001 as a 
> > > > > minimal
> > > > > fix.
> > > >
> > > > I think that way we may need to add the check for in_use before
> > > > accessing each of the LogicalRepWorker struct fields or form some rule
> > > > about which fields (or places) are okay to access without checking
> > > > in_use field.
> > >
> > > As far as I realize, we have that rule already.  It's only a few
> > > relatively new places that have broken it.  I understand that the in_use
> > > concept comes from the one of the same name in ReplicationSlot, except
> > > that it is not at all documented in worker_internal.h.
> > >
> > > So I propose we do both: apply Zhijie's patch and my 0001 now; and
> > > somebody gets to document the locking design for LogicalRepWorker.
> > >
> >
> > Agreed.
> >
>
> Pushed both the patches.
>

IMO there are inconsistencies in the second patch that was pushed.

1. In the am_xxx functions, why is there Assert 'in_use' only for the
APPLY / PARALLEL_APPLY workers but not for TABLESYNC workers?

2. In the am_xxx functions there is now Assert 'in_use', so why are we
still using macros to check again what we already asserted is not
possible? (Or, if the checking overkill was a deliberate choice then
why is there no isLeaderApplyWorker macro?)

~

PSA a small patch to address these.

--
Kind Regards,
Peter Smith.
Fujitsu Australia


v1-0001-Fix-am_xxx-function-Asserts.patch
Description: Binary data


Re: New WAL record to detect the checkpoint redo location

2023-08-27 Thread Michael Paquier
On Fri, Aug 25, 2023 at 11:08:25AM +0530, Dilip Kumar wrote:
> Here is the updated version of the patch.

The concept of the patch looks sound to me.  I have a few comments. 

+* This special record, however, is not required when we doing a 
shutdown
+* checkpoint, as there will be no concurrent wal insertions during that
+* time.  So, the shutdown checkpoint LSN will be the same as
+* checkpoint-redo LSN.

This is missing "are", as in "when we are doing a shutdown
checkpoint".

-   freespace = INSERT_FREESPACE(curInsert);
-   if (freespace == 0)

The variable "freespace" can be moved within the if block introduced
by this patch when calculating the REDO location for the shutdown
case.  And you can do the same with curInsert?

-* Compute new REDO record ptr = location of next XLOG record.
-*
-* NB: this is NOT necessarily where the checkpoint record itself will 
be,
-* since other backends may insert more XLOG records while we're off 
doing
-* the buffer flush work.  Those XLOG records are logically after the
-* checkpoint, even though physically before it.  Got that?
+* In case of shutdown checkpoint, compute new REDO record
+* ptr = location of next XLOG record.

It seems to me that keeping around this comment is important,
particularly for the case where we have a shutdown checkpoint and we
expect nothing to run, no?

How about adding a test in pg_walinspect?  There is an online
checkpoint running there, so you could just add something like that
to check that the REDO record is at the expected LSN stored in the
control file:
+-- Check presence of REDO record.
+SELECT redo_lsn FROM pg_control_checkpoint() \gset
+SELECT start_lsn = :'redo_lsn'::pg_lsn AS same_lsn, record_type
+  FROM pg_get_wal_record_info(:'redo_lsn');
--
Michael




Re: CI speed improvements for FreeBSD

2023-08-27 Thread Thomas Munro
And after adding this to the commitfest, here's the first cfbot run.
The gain was due to "test_world" which shows a greater-than-2x speedup
(~4:30 -> ~2:08) from 2x CPUs.  That is nice for humans who want the
answer as soon as possible, but note that the resource usage cost
might go up because of the non-parallel parts now wasting more idle
CPUs: git clone, meson configure etc (as they do on every platform).

https://cirrus-ci.com/build/6060109692928000




CI speed improvements for FreeBSD

2023-08-27 Thread Thomas Munro
Hi,

Here are a couple of changes that got FreeBSD down to 4:29 total, 2:40
in test_world in my last run (over 2x speedup), using a RAM disk
backed by a swap partition, and more CPUs.  It's still a regular UFS
file system but FreeBSD is not as good at avoiding I/O around short
lived files and directories as Linux: it can get hung up on a bunch of
synchronous I/O, and also flushes disk caches for those writes,
without an off switch.

I don't know about Windows, but I suspect the same applies there, ie
synchronous I/O blocking system calls around our blizzard of file
creations and unlinks.  Anyone know how to try it?
From d47d01edbbf88d1cfc5fa2c48024e3bc85b52eae Mon Sep 17 00:00:00 2001
From: Thomas Munro 
Date: Sun, 27 Aug 2023 11:01:24 +1200
Subject: [PATCH 1/2] ci: Use a RAM disk on FreeBSD.

Run the tests in a RAM disk.  It's still a UFS file system and is backed
by 20GB of disk, but it avoids a lot of I/O.  This shaves over a minute
off the test_world time, and scales better.
---
 src/tools/ci/gcp_freebsd_repartition.sh | 24 +++-
 1 file changed, 11 insertions(+), 13 deletions(-)

diff --git a/src/tools/ci/gcp_freebsd_repartition.sh b/src/tools/ci/gcp_freebsd_repartition.sh
index 2d5e173899..91c0e7f93c 100755
--- a/src/tools/ci/gcp_freebsd_repartition.sh
+++ b/src/tools/ci/gcp_freebsd_repartition.sh
@@ -3,26 +3,24 @@
 set -e
 set -x
 
-# The default filesystem on freebsd gcp images is very slow to run tests on,
-# due to its 32KB block size
-#
-# XXX: It'd probably better to fix this in the image, using something like
-# https://people.freebsd.org/~lidl/blog/re-root.html
-
 # fix backup partition table after resize
 gpart recover da0
 gpart show da0
-# kill swap, so we can delete a partition
-swapoff -a || true
-# (apparently we can only have 4!?)
+
+# delete and re-add swap partition with expanded size
+swapoff -a
 gpart delete -i 3 da0
-gpart add -t freebsd-ufs -l data8k -a 4096 da0
+gpart add -t freebsd-swap -l swapfs -a 4096 da0
 gpart show da0
-newfs -U -b 8192 /dev/da0p3
+swapon -a
+
+# create a file system on a memory disk backed by swap, to minimize I/O
+mdconfig -a -t swap -s20G -u md1
+newfs -b 8192 -U /dev/md1
 
-# Migrate working directory
+# migrate working directory
 du -hs $CIRRUS_WORKING_DIR
 mv $CIRRUS_WORKING_DIR $CIRRUS_WORKING_DIR.orig
 mkdir $CIRRUS_WORKING_DIR
-mount -o noatime /dev/da0p3 $CIRRUS_WORKING_DIR
+mount -o noatime /dev/md1 $CIRRUS_WORKING_DIR
 cp -r $CIRRUS_WORKING_DIR.orig/* $CIRRUS_WORKING_DIR/
-- 
2.41.0

From ee51fc23a3297e1acdeffc5b4ca09129ba7c889a Mon Sep 17 00:00:00 2001
From: Thomas Munro 
Date: Sun, 27 Aug 2023 12:09:39 +1200
Subject: [PATCH 2/2] ci: Use more CPUs on FreeBSD.

Reduce test_world time by using all available CPUs.
---
 .cirrus.tasks.yml | 8 +++-
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/.cirrus.tasks.yml b/.cirrus.tasks.yml
index e137769850..69e2bcb7ad 100644
--- a/.cirrus.tasks.yml
+++ b/.cirrus.tasks.yml
@@ -126,11 +126,9 @@ task:
   name: FreeBSD - 13 - Meson
 
   env:
-# FreeBSD on GCP is slow when running with larger number of CPUS /
-# jobs. Using one more job than cpus seems to work best.
-CPUS: 2
-BUILD_JOBS: 3
-TEST_JOBS: 3
+CPUS: 4
+BUILD_JOBS: 4
+TEST_JOBS: 8
 IMAGE_FAMILY: pg-ci-freebsd-13
 DISK_SIZE: 50
 
-- 
2.41.0



Re: BUG #17946: LC_MONETARY & DO LANGUAGE plperl - BUG

2023-08-27 Thread Heikki Linnakangas

On 27/08/2023 16:41, Joe Conway wrote:

On 8/15/23 10:40, Heikki Linnakangas wrote:

If multiple interpreters are used, is the single perl_locale_obj
variable still enough? Each interpreter can have their own locale I believe.


So in other words plperl and plperlu both used in the same query? I
don't see how we could get from one to the other without going through
the outer "postgres" locale first. Or are you thinking something else?


I think you got that backwards. 'perl_locale_obj' is set to the perl
interpreter's locale, whenever we are *outside* the interpreter.


This crashes with the patch:

postgres=# DO LANGUAGE plperlu
$function$
   use POSIX qw(setlocale LC_NUMERIC);
   use locale;

   setlocale LC_NUMERIC, "sv_SE.utf8";
$function$;
DO
postgres=# do language plperl $$ $$;
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.

I was going to test using plperl and plperlu in the same session and
expected the interpreters to mix up the locales they use. Maybe the
crash is because of something like that, although I didn't expect a
crash, just weird confusion on which locale is used.


--
Heikki Linnakangas
Neon (https://neon.tech)





Re: DecodeInterval fixes

2023-08-27 Thread Joseph Koshakow
On Tue, Aug 22, 2023 at 12:58 PM Jacob Champion 
wrote:
>
> On Mon, Aug 21, 2023 at 10:39 PM Michael Paquier 
wrote:
> > 0002 and 0003 make this stuff fail, but isn't there a risk that this
> > breaks applications that relied on these accidental behaviors?
> > Assuming that this is OK makes me nervous.
>
> I wouldn't argue for backpatching, for sure, but I guess I saw this as
> falling into the same vein as 5b3c5953 and bcc704b52 which were
> already committed.

I agree, I don't think we should try and backport this. As Jacob
highlighted, we've merged similar patches for other date time types.
If applications were relying on this behavior, the upgrade may be a
good time for them to re-evaluate their usage since it's outside the
documented spec and they may not be getting the units they're expecting
from intervals like '1 day month'.

Thanks,
Joe Koshakow


Re: list of acknowledgments for PG16

2023-08-27 Thread Pavel Borisov
I'm not completely sure what should be in this list, but maybe also
tuplesort extensibility [1]?
[1] https://www.postgresql.org/message-id/flat/CALT9ZEHjgO_r2cFr35%3Du9xZa6Ji2e7oVfSEBRBj0Gc%2BtJjTxSg%40mail.gmail.com#201dc4202af38f224a1e3acc78795199




Re: list of acknowledgments for PG16

2023-08-27 Thread Peter Eisentraut

On 23.08.23 09:13, Denis Laxalde wrote:

Peter Eisentraut wrote:

The list of acknowledgments for the PG16 release notes has been
committed.  It should show up here sometime:
.
   As usual, please check for problems such as wrong sorting, duplicate
names in different variants, or names in the wrong order etc.  (Our
convention is given name followed by surname.)



"Gabriele Varrazzo" is mentioned in commit 
0032a5456708811ca95bd80a538f4fb72ad0dd20 but it should be "Daniele 
Varrazzo" (per Discussion link in commit message); the later is already 
in the list.


Fixed.




Re: list of acknowledgments for PG16

2023-08-27 Thread Peter Eisentraut

On 22.08.23 15:48, Vik Fearing wrote:

On 8/22/23 11:33, Peter Eisentraut wrote:
The list of acknowledgments for the PG16 release notes has been 
committed.  It should show up here sometime: 
.  As usual, please check for problems such as wrong sorting, duplicate names in different variants, or names in the wrong order etc.  (Our convention is given name followed by surname.)


I think these might be the same person:

     Zhihong Yu
     Zihong Yu

I did not spot any others.


Fixed.




Re: persist logical slots to disk during shutdown checkpoint

2023-08-27 Thread vignesh C
On Fri, 25 Aug 2023 at 17:40, vignesh C  wrote:
>
> On Sat, 19 Aug 2023 at 11:53, Amit Kapila  wrote:
> >
> > It's entirely possible for a logical slot to have a confirmed_flush
> > LSN higher than the last value saved on disk while not being marked as
> > dirty.  It's currently not a problem to lose that value during a clean
> > shutdown / restart cycle but to support the upgrade of logical slots
> > [1] (see latest patch at [2]), we seem to rely on that value being
> > properly persisted to disk. During the upgrade, we need to verify that
> > all the data prior to shudown_checkpoint for the logical slots has
> > been consumed, otherwise, the downstream may miss some data. Now, to
> > ensure the same, we are planning to compare the confirm_flush LSN
> > location with the latest shudown_checkpoint location which means that
> > the confirm_flush LSN should be updated after restart.
> >
> > I think this is inefficient even without an upgrade because, after the
> > restart, this may lead to decoding some data again. Say, we process
> > some transactions for which we didn't send anything downstream (the
> > changes got filtered) but the confirm_flush LSN is updated due to
> > keepalives. As we don't flush the latest value of confirm_flush LSN,
> > it may lead to processing the same changes again.
>
> I was able to test and verify that we were not processing the same
> changes again.
> Note: The 0001-Add-logs-to-skip-transaction-filter-insert-operation.patch
> has logs to print if a decode transaction is skipped and also a log to
> mention if any operation is filtered.
> The test.sh script has the steps for a) setting up logical replication
> for a table b) perform insert on table that need to be published (this
> will be replicated to the subscriber) c) perform insert on a table
> that will not be published (this insert will be filtered, it will not
> be replicated) d) sleep for 5 seconds e) stop the server f) start the
> server
> I used the following steps, do the following in HEAD:
> a) Apply 0001-Add-logs-to-skip-transaction-filter-insert-operation.patch
> patch in Head and build the binaries b) execute test.sh c) view N1.log
> file to see that the insert operations were filtered again by seeing
> the following logs:
> LOG:  Filter insert for table tbl2
> ...
> ===restart===
> ...
> LOG:  Skipping transaction 0/156AD10 as start decode at is greater 0/156AE40
> ...
> LOG:  Filter insert for table tbl2
>
> We can see that the insert operations on tbl2 which was filtered
> before server was stopped is again filtered after restart too in HEAD.
>
> Lets see that the same changes were not processed again with patch:
> a) Apply v4-0001-Persist-to-disk-logical-slots-during-a-shutdown-c.patch
> from [1] also apply
> 0001-Add-logs-to-skip-transaction-filter-insert-operation.patch patch
> and build the binaries b) execute test.sh c) view N1.log file to see
> that the insert operations were skipped after restart of server by
> seeing the following logs:
> LOG:  Filter insert for table tbl2
> ...
> ===restart===
> ...
> Skipping transaction 0/156AD10 as start decode at is greater 0/156AFB0
> ...
> Skipping transaction 0/156AE80 as start decode at is greater 0/156AFB0
>
> We can see that the insert operations on tbl2 are not processed again
> after restart with the patch.

Here is another way to test, using the pg_replslotdata approach that was
proposed earlier at [1].
I have rebased this on top of HEAD; the v5 version is attached.

We can use the same test as the test.sh shared at [2].
When executed with HEAD, it was noticed that confirmed_flush points to
a WAL location before both transactions:
 slot_name | slot_type | datoid | persistency | xmin | catalog_xmin | restart_lsn | confirmed_flush | two_phase_at | two_phase | plugin
-----------+-----------+--------+-------------+------+--------------+-------------+-----------------+--------------+-----------+----------
 sub       | logical   |      5 | persistent  |    0 |          735 | 0/1531E28   | 0/1531E60       | 0/0          | 0         | pgoutput

WAL record information generated using pg_walinspect for various
records at and after confirmed_flush WAL 0/1531E60:
 row_number | start_lsn | end_lsn | prev_lsn | xid | resource_manager | record_type | record_length | main_data_length | fpi_length | description | block_ref
------------+-----------+---------+----------+-----+------------------+-------------+---------------+------------------+------------+-------------+-----------
          1 | 0/1531E60 |

Re: Incremental View Maintenance, take 2

2023-08-27 Thread Yugo NAGATA
On Sun, 2 Jul 2023 10:38:20 +0800
jian he  wrote:

> ok. Now I really found a small bug.
> 
> this works as intended:
> BEGIN;
> CREATE INCREMENTAL MATERIALIZED VIEW test_ivm AS SELECT i, MIN(j) as
> min_j  FROM mv_base_a group by 1;
> INSERT INTO mv_base_a select 1,-2 where false;
> rollback;
> 
> however the following one:
> BEGIN;
> CREATE INCREMENTAL MATERIALIZED VIEW test_ivm1 AS SELECT MIN(j) as
> min_j  FROM mv_base_a;
> INSERT INTO mv_base_a  select 1, -2 where false;
> rollback;
> 
> will evaluate
> tuplestore_tuple_count(new_tuplestores) to 1, it will walk through
> IVM_immediate_maintenance function to apply_delta.
> but should it be zero?

This is not a bug, because an aggregate without GROUP BY always
results in one row whose value is NULL.

The contents of test_ivm1 would always be the same as "SELECT MIN(j) as min_j
FROM mv_base_a;", wouldn't it?


Regards,
Yugo Nagata

-- 
Yugo NAGATA 




Re: UTF8 national character data type support WIP patch and list of open issues.

2023-08-27 Thread Chapman Flack

Hi,

Although this is a ten-year-old message, it was the one I found quickly
when looking to see what the current state of play on this might be.

On 2013-09-20 14:22, Robert Haas wrote:

Hmm.  So under that design, a database could support up to a total of
two character sets, the one that you get when you say 'foo' and the
other one that you get when you say n'foo'.

I guess we could do that, but it seems a bit limited.  If we're going
to go to the trouble of supporting multiple character sets, why not
support an arbitrary number instead of just two?


Because that old thread came to an end without mentioning how the
standard approaches that, it seemed worth adding, just to complete the
record.

In the draft of the standard I'm looking at (which is also around a
decade old), n'foo' is nothing but a handy shorthand for _csname'foo'
(which is a syntax we do not accept) for some particular csname that
was chosen when setting up the db.

So really, the standard contemplates letting you have columns of
arbitrary different charsets (CHAR(x) CHARACTER SET csname), and
literals of arbitrary charsets _csname'foo'. Then, as a bit of
sugar, you get to pick which two of those charsets you'd like
to have easy shorter ways of writing, 'foo' or n'foo',
CHAR or NCHAR.

The grammar for csname is kind of funky. It can be nothing but
, which has the nice restricted form
/[A-Za-z][A-Za-z0-9_]*/. But it can also be schema-qualified,
with the schema of course being a full-fledged .

So yeah, to fully meet this part of the standard, the parser'd
have to know that
 _U&"I am a schema nameZ0021" UESCAPE 'Z'/*hi!*/.LATIN1'foo'
is a string literal, expressing foo, in a character set named
LATIN1, in some cutely-named schema.

Never a dull moment.

Regards,
-Chap




Re: Incremental View Maintenance, take 2

2023-08-27 Thread Yugo NAGATA
On Sun, 2 Jul 2023 08:25:12 +0800
jian he  wrote:

> This is probably not trivial.
> In function  apply_new_delta_with_count.
> 
>  appendStringInfo(,
> "WITH updt AS (" /* update a tuple if this exists in the view */
> "UPDATE %s AS mv SET %s = mv.%s OPERATOR(pg_catalog.+) diff.%s "
> "%s " /* SET clauses for aggregates */
> "FROM %s AS diff "
> "WHERE %s " /* tuple matching condition */
> "RETURNING %s" /* returning keys of updated tuples */
> ") INSERT INTO %s (%s)" /* insert a new tuple if this doesn't existw */
> "SELECT %s FROM %s AS diff "
> "WHERE NOT EXISTS (SELECT 1 FROM updt AS mv WHERE %s);",
> 
> -
> ") INSERT INTO %s (%s)" /* insert a new tuple if this doesn't existw */
> "SELECT %s FROM %s AS diff "
> 
> the INSERT INTO line, should have one white space in the end?
> also "existw" should be "exists"

Yes, we should add a space there, although it works as is. I'll fix that as well as the typo.
Thank you.

Regards,
Yugo Nagata

-- 
Yugo NAGATA 




Re: Incremental View Maintenance, take 2

2023-08-27 Thread Yugo NAGATA
On Fri, 30 Jun 2023 08:00:00 +0800
jian he  wrote:

> Hi there.
> in v28-0005-Add-Incremental-View-Maintenance-support-to-psql.patch
> I don't know how to set psql to get the output
> "Incremental view maintenance: yes"

This information will appear when you use the "\d+" command for an
incrementally maintained materialized view.

Regards,
Yugo Nagata


-- 
Yugo NAGATA 




Re: Incremental View Maintenance, take 2

2023-08-27 Thread Yugo NAGATA
On Thu, 29 Jun 2023 18:51:06 +0800
jian he  wrote:

> I cannot build the doc.
> git clean  -fdx
> git am ~/Desktop/tmp/*.patch
> 
> Applying: Add a syntax to create Incrementally Maintainable Materialized Views
> Applying: Add relisivm column to pg_class system catalog
> Applying: Allow to prolong life span of transition tables until transaction 
> end
> Applying: Add Incremental View Maintenance support to pg_dump
> Applying: Add Incremental View Maintenance support to psql
> Applying: Add Incremental View Maintenance support
> Applying: Add DISTINCT support for IVM
> Applying: Add aggregates support in IVM
> Applying: Add support for min/max aggregates for IVM
> Applying: Add regression tests for Incremental View Maintenance
> Applying: Add documentations about Incremental View Maintenance
> .git/rebase-apply/patch:79: trailing whitespace.
>  clause.
> warning: 1 line adds whitespace errors.
> 
> Because of this, the {ninja docs} command failed. ERROR message:
> 
> [6/6] Generating doc/src/sgml/html with a custom command
> FAILED: doc/src/sgml/html
> /usr/bin/python3
> ../../Desktop/pg_sources/main/postgres/doc/src/sgml/xmltools_dep_wrapper
> --targetname doc/src/sgml/html --depfile doc/src/sgml/html.d --tool
> /usr/bin/xsltproc -- -o doc/src/sgml/ --nonet --stringparam pg.version
> 16beta2 --path doc/src/sgml --path
> ../../Desktop/pg_sources/main/postgres/doc/src/sgml
> ../../Desktop/pg_sources/main/postgres/doc/src/sgml/stylesheet.xsl
> doc/src/sgml/postgres-full.xml
> ERROR: id attribute missing on  element under /book[@id =
> 'postgres']/part[@id = 'server-programming']/chapter[@id =
> 'rules']/sect1[@id = 'rules-ivm']
> error: file doc/src/sgml/postgres-full.xml
> xsltRunStylesheet : run failed
> ninja: build stopped: subcommand failed.

Thank you for pointing this out.

I'll add ids for all sections to suppress the errors.

Regards,
Yugo Nagata

-- 
Yugo NAGATA 




Re: Incremental View Maintenance, take 2

2023-08-27 Thread Yugo NAGATA
On Thu, 29 Jun 2023 18:20:32 +0800
jian he  wrote:

> On Thu, Jun 29, 2023 at 12:40 AM jian he  wrote:
> >
> > On Wed, Jun 28, 2023 at 4:06 PM Yugo NAGATA  wrote:
> > >
> > > On Wed, 28 Jun 2023 00:01:02 +0800
> > > jian he  wrote:
> > >
> > > > On Thu, Jun 1, 2023 at 2:47 AM Yugo NAGATA  wrote:
> > > > >
> > > > > On Thu, 1 Jun 2023 23:59:09 +0900
> > > > > Yugo NAGATA  wrote:
> > > > >
> > > > > > Hello hackers,
> > > > > >
> > > > > > Here's a rebased version of the patch-set adding Incremental View
> > > > > > Maintenance support for PostgreSQL. That was discussed in [1].
> > > > >
> > > > > > [1] 
> > > > > > https://www.postgresql.org/message-id/flat/20181227215726.4d166b4874f8983a641123f5%40sraoss.co.jp
> > > > >
> > > > > ---
> > > > > * Overview
> > > > >
> > > > > Incremental View Maintenance (IVM) is a way to make materialized views
> > > > > up-to-date by computing only incremental changes and applying them on
> > > > > views. IVM is more efficient than REFRESH MATERIALIZED VIEW when
> > > > > only small parts of the view are changed.
> > > > >
> > > > > ** Feature
> > > > >
> > > > > The attached patchset provides a feature that allows materialized 
> > > > > views
> > > > > to be updated automatically and incrementally just after a underlying
> > > > > table is modified.
> > > > >
> > > > > You can create an incementally maintainable materialized view (IMMV)
> > > > > by using CREATE INCREMENTAL MATERIALIZED VIEW command.
> > > > >
> > > > > The followings are supported in view definition queries:
> > > > > - SELECT ... FROM ... WHERE ..., joins (inner joins, self-joins)
> > > > > - some built-in aggregate functions (count, sum, avg, min, max)
> > > > > - GROUP BY clause
> > > > > - DISTINCT clause
> > > > >
> > > > > Views can contain multiple tuples with the same content (duplicate 
> > > > > tuples).
> > > > >
> > > > > ** Restriction
> > > > >
> > > > > The following are not supported in a view definition:
> > > > > - Outer joins
> > > > > - Aggregates otehr than above, window functions, HAVING
> > > > > - Sub-queries, CTEs
> > > > > - Set operations (UNION, INTERSECT, EXCEPT)
> > > > > - DISTINCT ON, ORDER BY, LIMIT, OFFSET
> > > > >
> > > > > Also, a view definition query cannot contain other views, 
> > > > > materialized views,
> > > > > foreign tables, partitioned tables, partitions, VALUES, non-immutable 
> > > > > functions,
> > > > > system columns, or expressions that contains aggregates.
> > > > >
> > > > > ---
> > > > > * Design
> > > > >
> > > > > An IMMV is maintained using statement-level AFTER triggers.
> > > > > When an IMMV is created, triggers are automatically created on all 
> > > > > base
> > > > > tables contained in the view definition query.
> > > > >
> > > > > When a table is modified, changes that occurred in the table are 
> > > > > extracted
> > > > > as transition tables in the AFTER triggers. Then, changes that will 
> > > > > occur in
> > > > > the view are calculated by a rewritten view dequery in which the 
> > > > > modified table
> > > > > is replaced with the transition table.
> > > > >
> > > > > For example, if the view is defined as "SELECT * FROM R, S", and 
> > > > > tuples inserted
> > > > > into R are stored in a transiton table dR, the tuples that will be 
> > > > > inserted into
> > > > > the view are calculated as the result of "SELECT * FROM dR, S".
> > > > >
> > > > > ** Multiple Tables Modification
> > > > >
> > > > > Multiple tables can be modified in a statement when using triggers, 
> > > > > foreign key
> > > > > constraint, or modifying CTEs. When multiple tables are modified, we 
> > > > > need
> > > > > the state of tables before the modification.
> > > > >
> > > > > For example, when some tuples, dR and dS, are inserted into R and S 
> > > > > respectively,
> > > > > the tuples that will be inserted into the view are calculated by the 
> > > > > following
> > > > > two queries:
> > > > >
> > > > >  "SELECT * FROM dR, S_pre"
> > > > >  "SELECT * FROM R, dS"
> > > > >
> > > > > where S_pre is the table before the modification, R is the current 
> > > > > state of
> > > > > table, that is, after the modification. This pre-update states of 
> > > > > table
> > > > > is calculated by filtering inserted tuples and appending deleted 
> > > > > tuples.
> > > > > The subquery that represents pre-update state is generated in 
> > > > > get_prestate_rte().
> > > > > Specifically, the insterted tuples are filtered by calling 
> > > > > IVM_visible_in_prestate()
> > > > > in WHERE clause. This function checks the visibility of tuples by 
> > > > > using
> > > > > the snapshot taken before table modification. The deleted tuples are 
> > > > > contained
> > > > > in the old transition table, and this table is appended using UNION 
> > > > > ALL.
> > > > >
> > > > > Transition tables 

Re: BUG #17946: LC_MONETARY & DO LANGUAGE plperl - BUG

2023-08-27 Thread Joe Conway

On 8/15/23 10:40, Heikki Linnakangas wrote:

On 01/08/2023 16:48, Joe Conway wrote:

Any further comments on the posted patch[1]? I would like to apply/push
this prior to the beta and minor releases next week.


I'm not sure about the placement of the uselocale() calls. In
plperl_spi_exec(), for example, I think we should switch to the global
locale right after the check_spi_usage_allowed() call. Otherwise, if an
error happens in BeginInternalSubTransaction() or in pg_verifymbstr(),
the error would be processed with the perl locale. Maybe that's
harmless, error processing hardly cares about LC_MONETARY, but seems
wrong in principle.


I guess you could probably argue that we should flip this around, and 
only enter the perl locale when calling into libperl, and exit the perl 
locale every time we reemerge under plperl.c control. That seems pretty 
drastic and potentially messy though.



Hmm, come to think of it, if BeginInternalSubTransaction() throws an
error, we just jump out of the perl interpreter? That doesn't seem cool.
But that's not new with this patch.


Hmm, true enough I guess.


If I'm reading correctly, compile_plperl_function() calls
select_perl_context(), which calls plperl_trusted_init(), which calls
uselocale(). So it leaves locale set to the perl locale. Who sets it back?


No one does, it seems, at least not currently.


How about adding a small wrapper around eval_pl() that sets and unsets
the locale(), just when we enter the interpreter? It's easier to see
that we are doing the calls in right places, if we make them as close as
possible to entering/exiting the interpreter. Are there other functions
in addition to eval_pl() that need to be called with the perl locale?


I can see that as a better strategy, but "other functions in addition to 
eval_pv()" (I assume you mean eval_pv rather than eval_pl) is a tricky 
one to answer.


I ran the attached script like so (from cwd src/pl/plperl) like so:
```
symbols-used.sh /lib/x86_64-linux-gnu/libperl.so.5.34 plperl.so
```
and get a fairly long list of exported libperl functions that get linked 
into plperl.so:


```
Matched symbols:
boot_DynaLoader
perl_alloc
Perl_av_extend
Perl_av_fetch
Perl_av_len
Perl_av_push
*Perl_call_list
*Perl_call_pv
*Perl_call_sv
perl_construct
Perl_croak
Perl_croak_nocontext
Perl_croak_sv
Perl_croak_xs_usage
Perl_die
*Perl_eval_pv
Perl_free_tmps
Perl_get_sv
Perl_gv_add_by_type
Perl_gv_stashpv
Perl_hv_clear
Perl_hv_common
Perl_hv_common_key_len
Perl_hv_iterinit
Perl_hv_iternext
Perl_hv_iternext_flags
Perl_hv_iternextsv
Perl_hv_ksplit
Perl_looks_like_number
Perl_markstack_grow
Perl_mg_get
Perl_newRV
Perl_newRV_noinc
Perl_newSV
Perl_newSViv
Perl_newSVpv
Perl_newSVpvn
Perl_newSVpvn_flags
Perl_newSVsv
Perl_newSVsv_flags
Perl_newSV_type
Perl_newSVuv
Perl_newXS
Perl_newXS_flags
*perl_parse
Perl_pop_scope
Perl_push_scope
*perl_run
Perl_save_item
Perl_savetmps
Perl_stack_grow
Perl_sv_2bool
Perl_sv_2bool_flags
Perl_sv_2iv
Perl_sv_2iv_flags
Perl_sv_2mortal
Perl_sv_2pv
Perl_sv_2pvbyte
Perl_sv_2pvbyte_flags
Perl_sv_2pv_flags
Perl_sv_2pvutf8
Perl_sv_2pvutf8_flags
Perl_sv_bless
Perl_sv_free
Perl_sv_free2
Perl_sv_isa
Perl_sv_newmortal
Perl_sv_setiv
Perl_sv_setiv_mg
Perl_sv_setsv
Perl_sv_setsv_flags
Perl_sys_init
Perl_sys_init3
Perl_xs_boot_epilog
Perl_xs_handshake
```

I marked the ones in the above list that we should perhaps care about
with an asterisk:


*Perl_call_list
*Perl_call_pv
*Perl_call_sv
*Perl_eval_pv
*perl_run

but perhaps there are others?


/*
 * plperl_xact_callback --- cleanup at main-transaction end.
 */
static void
plperl_xact_callback(XactEvent event, void *arg)
{
/* ensure global locale is the current locale */
if (uselocale((locale_t) 0) != LC_GLOBAL_LOCALE)
perl_locale_obj = uselocale(LC_GLOBAL_LOCALE);
}


So the assumption is that the if current locale is not LC_GLOBAL_LOCALE,
then it was the perl locale. Seems true today, but this could confusion
if anything else calls uselocale(). In particular, if another PL
implementation copies this, and you use plperl and the other PL at the
same time, they would get mixed up. I think we need another "bool
perl_locale_obj_in_use" variable to track explicitly whether the perl
locale is currently active.


Or perhaps don't assume that we want the global locale and swap between 
pg_locale_obj (whatever it is) and perl_locale_obj?



If we are careful to put the uselocale() calls in the right places so
that we never ereport() while in perl locale, this callback isn't
needed. Maybe it's still a good idea, though, to be extra sure that
things get reset to a sane state if something unexpected happens.


I feel more comfortable that we have a "belt and suspenders" method to 
restore the locale that was in use by Postgres before entering perl.



If multiple interpreters are used, is the single perl_locale_obj
variable still enough? Each interpreter can have their own locale I believe.


So in other words plperl and plperlu both 

Re: Incremental View Maintenance, take 2

2023-08-27 Thread Yugo NAGATA
On Thu, 29 Jun 2023 00:40:45 +0800
jian he  wrote:

> On Wed, Jun 28, 2023 at 4:06 PM Yugo NAGATA  wrote:
> >
> > On Wed, 28 Jun 2023 00:01:02 +0800
> > jian he  wrote:
> >
> > > On Thu, Jun 1, 2023 at 2:47 AM Yugo NAGATA  wrote:
> > > >
> > > > On Thu, 1 Jun 2023 23:59:09 +0900
> > > > Yugo NAGATA  wrote:
> > > >
> > > > > Hello hackers,
> > > > >
> > > > > Here's a rebased version of the patch-set adding Incremental View
> > > > > Maintenance support for PostgreSQL. That was discussed in [1].
> > > >
> > > > > [1] 
> > > > > https://www.postgresql.org/message-id/flat/20181227215726.4d166b4874f8983a641123f5%40sraoss.co.jp
> > > >
> > > > ---
> > > > * Overview
> > > >
> > > > Incremental View Maintenance (IVM) is a way to make materialized views
> > > > up-to-date by computing only incremental changes and applying them on
> > > > views. IVM is more efficient than REFRESH MATERIALIZED VIEW when
> > > > only small parts of the view are changed.
> > > >
> > > > ** Feature
> > > >
> > > > The attached patchset provides a feature that allows materialized views
> > > > to be updated automatically and incrementally just after a underlying
> > > > table is modified.
> > > >
> > > > You can create an incementally maintainable materialized view (IMMV)
> > > > by using CREATE INCREMENTAL MATERIALIZED VIEW command.
> > > >
> > > > The followings are supported in view definition queries:
> > > > - SELECT ... FROM ... WHERE ..., joins (inner joins, self-joins)
> > > > - some built-in aggregate functions (count, sum, avg, min, max)
> > > > - GROUP BY clause
> > > > - DISTINCT clause
> > > >
> > > > Views can contain multiple tuples with the same content (duplicate 
> > > > tuples).
> > > >
> > > > ** Restriction
> > > >
> > > > The following are not supported in a view definition:
> > > > - Outer joins
> > > > - Aggregates otehr than above, window functions, HAVING
> > > > - Sub-queries, CTEs
> > > > - Set operations (UNION, INTERSECT, EXCEPT)
> > > > - DISTINCT ON, ORDER BY, LIMIT, OFFSET
> > > >
> > > > Also, a view definition query cannot contain other views, materialized 
> > > > views,
> > > > foreign tables, partitioned tables, partitions, VALUES, non-immutable 
> > > > functions,
> > > > system columns, or expressions that contains aggregates.
> > > >
> > > > ---
> > > > * Design
> > > >
> > > > An IMMV is maintained using statement-level AFTER triggers.
> > > > When an IMMV is created, triggers are automatically created on all base
> > > > tables contained in the view definition query.
> > > >
> > > > When a table is modified, changes that occurred in the table are 
> > > > extracted
> > > > as transition tables in the AFTER triggers. Then, changes that will 
> > > > occur in
> > > > the view are calculated by a rewritten view dequery in which the 
> > > > modified table
> > > > is replaced with the transition table.
> > > >
> > > > For example, if the view is defined as "SELECT * FROM R, S", and tuples 
> > > > inserted
> > > > into R are stored in a transiton table dR, the tuples that will be 
> > > > inserted into
> > > > the view are calculated as the result of "SELECT * FROM dR, S".
> > > >
> > > > ** Multiple Tables Modification
> > > >
> > > > Multiple tables can be modified in a statement when using triggers, 
> > > > foreign key
> > > > constraint, or modifying CTEs. When multiple tables are modified, we 
> > > > need
> > > > the state of tables before the modification.
> > > >
> > > > For example, when some tuples, dR and dS, are inserted into R and S 
> > > > respectively,
> > > > the tuples that will be inserted into the view are calculated by the 
> > > > following
> > > > two queries:
> > > >
> > > >  "SELECT * FROM dR, S_pre"
> > > >  "SELECT * FROM R, dS"
> > > >
> > > > where S_pre is the table before the modification, R is the current 
> > > > state of
> > > > table, that is, after the modification. This pre-update states of table
> > > > is calculated by filtering inserted tuples and appending deleted tuples.
> > > > The subquery that represents pre-update state is generated in 
> > > > get_prestate_rte().
> > > > Specifically, the insterted tuples are filtered by calling 
> > > > IVM_visible_in_prestate()
> > > > in WHERE clause. This function checks the visibility of tuples by using
> > > > the snapshot taken before table modification. The deleted tuples are 
> > > > contained
> > > > in the old transition table, and this table is appended using UNION ALL.
> > > >
> > > > Transition tables for each modification are collected in each AFTER 
> > > > trigger
> > > > function call. Then, the view maintenance is performed in the last call 
> > > > of
> > > > the trigger.
> > > >
> > > > In the original PostgreSQL, tuplestores of transition tables are freed 
> > > > at the
> > > > end of each nested query. However, their lifespan 

Re: [PoC] Improve dead tuple storage for lazy vacuum

2023-08-27 Thread Masahiko Sawada
On Wed, Aug 16, 2023 at 8:04 PM John Naylor
 wrote:
>
>
> On Tue, Aug 15, 2023 at 6:53 PM John Naylor  
> wrote:
> >
> > On Tue, Aug 15, 2023 at 9:34 AM Masahiko Sawada  
> > wrote:
> >
> > > BTW cfbot reported that some regression tests failed due to OOM. I've
> > > attached the patch to fix it.
> >
> > Seems worth doing now rather than later, so added this and squashed most of 
> > the rest together.
>
> This segfaults because of a mistake fixing a rebase conflict, so v40 attached.
>

Thank you for updating the patch set.

On Tue, Aug 15, 2023 at 11:33 AM Masahiko Sawada  wrote:
> On Mon, Aug 14, 2023 at 8:05 PM John Naylor
>  wrote:
> > Looking at the tidstore tests again after some months, I'm not particularly 
> > pleased with the amount of code required for how little it seems to be 
> > testing, nor the output when something fails. (I wonder how hard it would 
> > be to have SQL functions that add blocks/offsets to the tid store, and emit 
> > tuples of tids found in the store.)
>
> It would not be hard to have such SQL functions. I'll try it.

I've updated the regression tests for tidstore so that they use SQL
functions to add blocks/offsets and dump the store's contents. The new
tests provide the same coverage, but are executed through SQL functions
instead of running all tests in one SQL function.

The 0008 patch fixes a bug in tidstore that I found during this work: we
didn't recreate the radix tree in the same memory context in
TidStoreReset().

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com


v41-ART.tar.gz
Description: GNU Zip compressed data


RE: logical_replication_mode

2023-08-27 Thread Zhijie Hou (Fujitsu)
On Friday, August 25, 2023 5:56 PM Amit Kapila  wrote:
> 
> On Fri, Aug 25, 2023 at 12:38 PM Peter Eisentraut  
> wrote:
> >
> > On 25.08.23 08:52, Zhijie Hou (Fujitsu) wrote:
> > > On Friday, August 25, 2023 12:28 PM Amit Kapila
>  wrote:
> > >>
> > >> On Thu, Aug 24, 2023 at 12:45 PM Peter Eisentraut
> > >> 
> > >> wrote:
> > >>>
> > >>> I suggest we rename this setting to something starting with debug_.
> > >>> Right now, the name looks much too tempting for users to fiddle with.
> > >>> I think this is similar to force_parallel_mode.
> > >>>
> > >>
> > >> +1. How about debug_logical_replication?
> > >>
> > >>> Also, the descriptions in guc_tables.c could be improved.  For
> > >>> example,
> > >>>
> > >>>   gettext_noop("Controls when to replicate or apply each
> > >>> change."),
> > >>>
> > >>> is pretty content-free and unhelpful.
> > >>>
> > >>
> > >> The other possibility I could think of is to change short_desc as:
> > >> "Allows to replicate each change for large transactions.". Do you
> > >> have any better ideas?
> > >
> > > How about "Forces immediate streaming or serialization of changes in
> > > large transactions." which is similar to the description in document.
> > >
> > > I agree that renaming it to debug_xx would be better and here is a
> > > patch that tries to do this.
> >
> > Maybe debug_logical_replication is too general?  Something like
> > debug_logical_replication_streaming would be more concrete.
> >
> 
> Yeah, that sounds better.

OK, here is the debug_logical_replication_streaming version.

Best Regards,
Hou zj


0001-Rename-logical_replication_mode-to-debug_logical_rep.patch
Description:  0001-Rename-logical_replication_mode-to-debug_logical_rep.patch


[PoC] Implementation of distinct in Window Aggregates: take two

2023-08-27 Thread Ankit Pandey
Hi,

This is reopening of thread:
https://www.postgresql.org/message-id/flat/2ef6b491-1946-b606-f064-d9ea79d91463%40gmail.com#14e0bdb6872c0b26023d532eeb943d3e

This is a PoC patch which implements the distinct operation in window
aggregates (without ORDER BY and only for single-column aggregation; the
final version may vary with respect to these limitations). The purpose of
this PoC is to get feedback on the approach and the corresponding
implementation, plus any nitpicking deemed reasonable.

The distinct operation is mirrored from the implementation in nodeAgg.
The existing partitioning logic determines whether a row is in the
partition, and when distinct is required, all tuples for the aggregate
column are stored in a tuplesort. When finalize_windowaggregate gets
called, the tuples are sorted and duplicates are removed, after which the
transition function is called on each tuple.
When distinct is not required, the above process is skipped, the
transition function gets called directly, and nothing gets inserted into
the tuplesort.
Note: for each partition, tuplesort_begin and tuplesort_end are invoked to
rinse the tuplesort, so at any time the maximum number of tuples in the
tuplesort equals the number of tuples in a particular partition.

I have verified it for integer and interval column aggregates (to rule out
obvious issues related to data types).

Sample cases:

create table mytable(id int, name text);
insert into mytable values(1, 'A');
insert into mytable values(1, 'A');
insert into mytable values(5, 'B');
insert into mytable values(3, 'A');
insert into mytable values(1, 'A');

select avg(distinct id) over (partition by name) from mytable;
avg

2.
2.
2.
2.
5.

select avg(id) over (partition by name) from mytable;
avg

 1.5000
 1.5000
 1.5000
 1.5000
 5.

select avg(distinct id) over () from mytable;
avg

 3.
 3.
 3.
 3.
 3.

select avg(distinct id)  from mytable;
avg

 3.

This is my first contribution. Please let me know if anything can be
improved, as I'm eager to learn.

Regards,
Ankit Kumar Pandey
From d7b5face40794f93c6134aedb57c85cf0f077780 Mon Sep 17 00:00:00 2001
From: Ankit Kumar Pandey 
Date: Wed, 23 Nov 2022 00:38:01 +0530
Subject: [PATCH] 3WIP commit

allow distinct in windows function
---
 src/backend/executor/nodeWindowAgg.c | 211 +++
 src/backend/optimizer/util/clauses.c |   2 +
 src/backend/parser/parse_agg.c   |  45 ++
 src/backend/parser/parse_func.c  |  20 +--
 src/include/nodes/execnodes.h|   1 +
 src/include/nodes/primnodes.h|   2 +
 src/test/regress/expected/window.out |  16 ++
 src/test/regress/sql/window.sql  |   3 +
 8 files changed, 260 insertions(+), 40 deletions(-)

diff --git a/src/backend/executor/nodeWindowAgg.c b/src/backend/executor/nodeWindowAgg.c
index 310ac23e3a..b396f19e05 100644
--- a/src/backend/executor/nodeWindowAgg.c
+++ b/src/backend/executor/nodeWindowAgg.c
@@ -155,6 +155,13 @@ typedef struct WindowStatePerAggData
 
 	int64		transValueCount;	/* number of currently-aggregated rows */
 
+	Datum		lastdatum;		/* used for single-column DISTINCT */
+	FmgrInfo	equalfnOne; /* single-column comparisons*/
+	Oid			*eq_ops;
+	Oid			*sort_ops;
+
+	bool sort_in;
+
 	/* Data local to eval_windowaggregates() */
 	bool		restart;		/* need to restart this agg in this cycle? */
 } WindowStatePerAggData;
@@ -164,7 +171,7 @@ static void initialize_windowaggregate(WindowAggState *winstate,
 	   WindowStatePerAgg peraggstate);
 static void advance_windowaggregate(WindowAggState *winstate,
 	WindowStatePerFunc perfuncstate,
-	WindowStatePerAgg peraggstate);
+	WindowStatePerAgg peraggstate, Datum value, bool isNull);
 static bool advance_windowaggregate_base(WindowAggState *winstate,
 		 WindowStatePerFunc perfuncstate,
 		 WindowStatePerAgg peraggstate);
@@ -174,6 +181,9 @@ static void finalize_windowaggregate(WindowAggState *winstate,
 	 Datum *result, bool *isnull);
 
 static void eval_windowaggregates(WindowAggState *winstate);
+static void process_ordered_aggregate_single(WindowAggState *winstate, 
+			 WindowStatePerFunc perfuncstate,
+ 			 WindowStatePerAgg peraggstate);
 static void eval_windowfunction(WindowAggState *winstate,
 WindowStatePerFunc perfuncstate,
 Datum *result, bool *isnull);
@@ -231,6 +241,7 @@ initialize_windowaggregate(WindowAggState *winstate,
 	peraggstate->transValueIsNull = peraggstate->initValueIsNull;
 	peraggstate->transValueCount = 0;
 	peraggstate->resultValue = (Datum) 0;
+	peraggstate->lastdatum = (Datum) 0;
 	peraggstate->resultValueIsNull = true;
 }
 
@@ -241,43 +252,21 @@ initialize_windowaggregate(WindowAggState *winstate,
 static void
 

Re: Make all Perl warnings fatal

2023-08-27 Thread Andrew Dunstan


On 2023-08-25 Fr 16:49, Dagfinn Ilmari Mannsåker wrote:

Alvaro Herrera  writes:


On 2023-Aug-10, Peter Eisentraut wrote:


I wanted to figure put if we can catch these more reliably, in the style of
-Werror.  AFAICT, there is no way to automatically turn all warnings into
fatal errors.  But there is a way to do it per script, by replacing

 use warnings;

by

 use warnings FATAL => 'all';

See attached patch to try it out.

BTW in case we do find that there's some unforeseen problem and we want
to roll back, it would be great to have a way to disable this without
having to edit every single Perl file again later.  However, I didn't
find a way to do it -- I thought about creating a separate PgWarnings.pm
file that would do the "use warnings FATAL => 'all'" dance and which
every other Perl file would use or include; but couldn't make it work.
Maybe some Perl expert knows a good answer to this.

Like most pragmas (all-lower-case module names), `warnings` affects the
currently-compiling lexical scope, so to have a module like PgWarnings
inject it into the module that uses it, you'd call warnings->import in
its import method (which gets called when the `use PgWarnings;``
statement is compiled, e.g.:

package PgWarnings;

sub import {
warnings->import(FATAL => 'all');
}

I wouldn't bother with a whole module just for that, but if we have a
group of pragmas or modules we always want to enable/import and have the
ability to change this set without having to edit all the files, it's
quite common to have a ProjectName::Policy module that does that.  For
example, to exclude warnings that are unsafe, pointless, or impossible
to fatalise (c.f.https://metacpan.org/pod/strictures#CATEGORY-SELECTIONS):

package PostgreSQL::Policy;

sub import {
strict->import;
warnings->import(
FATAL => 'all',
NONFATAL => qw(exec internal malloc recursion),
);
warnings->unimport(qw(once));
}

Now that we require Perl 5.14, we might want to consider enabling its
feature bundle as well, with:

feature->import(':5.14')

Although the only features of note that adds are:

- say: the `say` function, like `print` but appends a newline

- state: `state` variables, like `my` but only initialised the first
  time the function they're in is called, and the value persists
  between calls (like function-scoped `static` variables in C)

- unicode_strings: use unicode semantics for characters in the
  128-255 range, regardless of internal representation



We'd probably have to modify the perlcritic rules to account for it. See 
 
and similarly for RequireUseWarnings. In any case, it seems a bit like 
overkill.




Maybe the BEGIN block of each file can `eval` a new PgWarnings.pm that
emits the "use warnings" line?

That's ugly as sin, and thankfully not necessary.



Agreed.


cheers


andrew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com


Re: Ignore 2PC transaction GIDs in query jumbling

2023-08-27 Thread Michael Paquier
On Sat, Aug 19, 2023 at 01:47:48PM +0900, Michael Paquier wrote:
> On Fri, Aug 18, 2023 at 11:31:03AM +0100, Dagfinn Ilmari Mannsåker wrote:
>> I don't have a particularly strong opinion on whether we should
>> distinguish DEALLOCATE ALL from DEALLOCATE  (call it +0.5), but
> 
> The difference looks important to me, especially for monitoring.

After thinking about it for a few days, I'd still rather make the
distinction between the two, so I have applied it that way.

> And pgbouncer may also use both of them, actually?  (Somebody, please
> correct me here if necessary.)

I cannot see any signs of that in pgbouncer, btw.
--
Michael

