Re: Perform streaming logical transactions by background workers and parallel apply

2023-05-07 Thread Masahiko Sawada
On Mon, May 8, 2023 at 12:52 PM Zhijie Hou (Fujitsu)
 wrote:
>
> On Monday, May 8, 2023 11:08 AM Masahiko Sawada 
>
> Hi,
>
> >
> > On Tue, May 2, 2023 at 12:22 PM Amit Kapila 
> > wrote:
> > >
> > > On Fri, Apr 28, 2023 at 11:48 AM Masahiko Sawada
> >  wrote:
> > > >
> > > > While investigating this issue, I've reviewed the code around
> > > > callbacks and worker termination etc and I found a problem.
> > > >
> > > > A parallel apply worker calls the before_shmem_exit callbacks in the
> > > > following order:
> > > >
> > > > 1. ShutdownPostgres()
> > > > 2. logicalrep_worker_onexit()
> > > > 3. pa_shutdown()
> > > >
> > > > Since the worker is detached during logicalrep_worker_onexit(),
> > > > MyLogicalReplication->leader_pid is an invalid when we call
> > > > pa_shutdown():
> > > >
> > > > static void
> > > > pa_shutdown(int code, Datum arg)
> > > > {
> > > > Assert(MyLogicalRepWorker->leader_pid != InvalidPid);
> > > > SendProcSignal(MyLogicalRepWorker->leader_pid,
> > > >PROCSIG_PARALLEL_APPLY_MESSAGE,
> > > >InvalidBackendId);
> > > >
> > > > Also, if the parallel apply worker fails shm_toc_lookup() during the
> > > > initialization, it raises an error (because of noError = false) but
> > > > ends up a SEGV as MyLogicalRepWorker is still NULL.
> > > >
> > > > I think that we should not use MyLogicalRepWorker->leader_pid in
> > > > pa_shutdown() but instead store the leader's pid to a static variable
> > > > before registering pa_shutdown() callback.
> > > >
> > >
> > > Why not simply move the registration of pa_shutdown() to someplace
> > > after logicalrep_worker_attach()?
> >
> > If we do that, the worker won't call dsm_detach() if it raises an
> > ERROR in logicalrep_worker_attach(), is that okay? It seems that it's
> > no practically problem since we call dsm_backend_shutdown() in
> > shmem_exit(), but if so why do we need to call it in pa_shutdown()?
>
> I think the dsm_detach in pa_shutdown was intended to fire on_dsm_detach
> callbacks to give callback a chance to report stat before the stat system is
> shutdown, following what we do in ParallelWorkerShutdown() (e.g.
> sharedfileset.c callbacks cause fd.c to do ReportTemporaryFileUsage(), so we
> need to fire that earlier).
>
> But for parallel apply, we currently only have one on_dsm_detach
> callback(shm_mq_detach_callback) which doesn't report extra stats. So the
> dsm_detach in pa_shutdown is only used to make it a bit future-proof in case
> we add some other on_dsm_detach callbacks in the future which need to report
> stats.

Make sense . Given that it's possible that we add other callbacks that
report stats in the future, I think it's better not to move the
position to register pa_shutdown() callback.

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com




Re: PGDOCS - Replica Identity quotes

2023-05-07 Thread Michael Paquier
On Mon, May 08, 2023 at 01:57:33PM +1000, Peter Smith wrote:
> I agree. Do you want me to make a new v4 patch to also do that, or
> will you handle them in passing?

I'll just go handle them on the way, no need to send an updated
patch.
--
Michael


signature.asc
Description: PGP signature


Re: PGDOCS - Replica Identity quotes

2023-05-07 Thread Peter Smith
On Mon, May 8, 2023 at 1:09 PM Michael Paquier  wrote:
>
> On Mon, May 08, 2023 at 10:29:50AM +1000, Peter Smith wrote:
> > Thanks for giving some feedback on my patch.
>
> Looks OK.
>
> While on it, looking at logical-replication.sgml, it seems to me that
> these two are also incorrect, and we should use  markups:
> implemented by walsender and apply
> --

I agree. Do you want me to make a new v4 patch to also do that, or
will you handle them in passing?

--
Kind Regards,
Peter Smith.
Fujitsu Australia




RE: Perform streaming logical transactions by background workers and parallel apply

2023-05-07 Thread Zhijie Hou (Fujitsu)
On Monday, May 8, 2023 11:08 AM Masahiko Sawada 

Hi,

> 
> On Tue, May 2, 2023 at 12:22 PM Amit Kapila 
> wrote:
> >
> > On Fri, Apr 28, 2023 at 11:48 AM Masahiko Sawada
>  wrote:
> > >
> > > While investigating this issue, I've reviewed the code around
> > > callbacks and worker termination etc and I found a problem.
> > >
> > > A parallel apply worker calls the before_shmem_exit callbacks in the
> > > following order:
> > >
> > > 1. ShutdownPostgres()
> > > 2. logicalrep_worker_onexit()
> > > 3. pa_shutdown()
> > >
> > > Since the worker is detached during logicalrep_worker_onexit(),
> > > MyLogicalReplication->leader_pid is an invalid when we call
> > > pa_shutdown():
> > >
> > > static void
> > > pa_shutdown(int code, Datum arg)
> > > {
> > > Assert(MyLogicalRepWorker->leader_pid != InvalidPid);
> > > SendProcSignal(MyLogicalRepWorker->leader_pid,
> > >PROCSIG_PARALLEL_APPLY_MESSAGE,
> > >InvalidBackendId);
> > >
> > > Also, if the parallel apply worker fails shm_toc_lookup() during the
> > > initialization, it raises an error (because of noError = false) but
> > > ends up a SEGV as MyLogicalRepWorker is still NULL.
> > >
> > > I think that we should not use MyLogicalRepWorker->leader_pid in
> > > pa_shutdown() but instead store the leader's pid to a static variable
> > > before registering pa_shutdown() callback.
> > >
> >
> > Why not simply move the registration of pa_shutdown() to someplace
> > after logicalrep_worker_attach()?
> 
> If we do that, the worker won't call dsm_detach() if it raises an
> ERROR in logicalrep_worker_attach(), is that okay? It seems that it's
> no practically problem since we call dsm_backend_shutdown() in
> shmem_exit(), but if so why do we need to call it in pa_shutdown()?

I think the dsm_detach in pa_shutdown was intended to fire on_dsm_detach
callbacks to give callback a chance to report stat before the stat system is
shutdown, following what we do in ParallelWorkerShutdown() (e.g.
sharedfileset.c callbacks cause fd.c to do ReportTemporaryFileUsage(), so we
need to fire that earlier).

But for parallel apply, we currently only have one on_dsm_detach
callback(shm_mq_detach_callback) which doesn't report extra stats. So the
dsm_detach in pa_shutdown is only used to make it a bit future-proof in case
we add some other on_dsm_detach callbacks in the future which need to report
stats.

Best regards,
Hou zj



Re: [DOC] Update ALTER SUBSCRIPTION documentation

2023-05-07 Thread Peter Smith
On Fri, May 5, 2023 at 11:17 PM Robert Sjöblom
 wrote:
>
>
> Hi,
>
> We have recently used the PostgreSQL documentation when setting up our
> logical replication. We noticed there was a step missing in the
> documentation on how to drop a logical replication subscription with a
> replication slot attached.
>
> We clarify the documentation to include prerequisites for running the
> DROP SUBSCRIPTION command. Please see attached patch.

Right, there is a "missing step" in the documentation, but OTOH that
step is going to be obvious from the error you get when attempting to
set the slot_name to NONE:

e.g.
test_sub=# ALTER SUBSCRIPTION sub1 SET (slot_name= NONE);
ERROR:  cannot set slot_name = NONE for enabled subscription

~

IMO this scenario is sort of a trade-off between (a) wanting to give
every little step explicitly versus (b) trying to keep the
documentation free of clutter.

I think a comprise here is just to mention the need for disabling the
subscription but without spelling out the details of the ALTER ...
DISABLE command.

For example,

BEFORE
To proceed in this situation, disassociate the subscription from the
replication slot by executing ALTER SUBSCRIPTION ... SET (slot_name =
NONE).

SUGGESTION
To proceed in this situation, first DISABLE the subscription, and then
disassociate it from the replication slot by executing ALTER
SUBSCRIPTION ... SET (slot_name = NONE).

--
Kind Regards,
Peter Smith.
Fujitsu Australia




Re: Add PQsendSyncMessage() to libpq

2023-05-07 Thread Michael Paquier
On Wed, May 03, 2023 at 12:03:57PM +0200, Alvaro Herrera wrote:
> We already have 'int' flag masks in PQcopyResult() and
> PQsetTraceFlags().  We were using bits32 initially for flag stuff in the
> PQtrace facilities, until [1] reminded us that we shouldn't let c.h
> creep into app-land, so that was turned into plain 'int'.
> 
> [1] 
> https://www.postgresql.org/message-id/TYAPR01MB2990B6C6A32ACF15D97AE94AFEBD0%40TYAPR01MB2990.jpnprd01.prod.outlook.com

Indeed.  Good point!
--
Michael


signature.asc
Description: PGP signature


Re: PGDOCS - Replica Identity quotes

2023-05-07 Thread Michael Paquier
On Mon, May 08, 2023 at 10:29:50AM +1000, Peter Smith wrote:
> Thanks for giving some feedback on my patch.

Looks OK.

While on it, looking at logical-replication.sgml, it seems to me that
these two are also incorrect, and we should use  markups:
implemented by walsender and apply
--
Michael


signature.asc
Description: PGP signature


Re: Perform streaming logical transactions by background workers and parallel apply

2023-05-07 Thread Masahiko Sawada
On Tue, May 2, 2023 at 12:22 PM Amit Kapila  wrote:
>
> On Fri, Apr 28, 2023 at 11:48 AM Masahiko Sawada  
> wrote:
> >
> > While investigating this issue, I've reviewed the code around
> > callbacks and worker termination etc and I found a problem.
> >
> > A parallel apply worker calls the before_shmem_exit callbacks in the
> > following order:
> >
> > 1. ShutdownPostgres()
> > 2. logicalrep_worker_onexit()
> > 3. pa_shutdown()
> >
> > Since the worker is detached during logicalrep_worker_onexit(),
> > MyLogicalReplication->leader_pid is an invalid when we call
> > pa_shutdown():
> >
> > static void
> > pa_shutdown(int code, Datum arg)
> > {
> > Assert(MyLogicalRepWorker->leader_pid != InvalidPid);
> > SendProcSignal(MyLogicalRepWorker->leader_pid,
> >PROCSIG_PARALLEL_APPLY_MESSAGE,
> >InvalidBackendId);
> >
> > Also, if the parallel apply worker fails shm_toc_lookup() during the
> > initialization, it raises an error (because of noError = false) but
> > ends up a SEGV as MyLogicalRepWorker is still NULL.
> >
> > I think that we should not use MyLogicalRepWorker->leader_pid in
> > pa_shutdown() but instead store the leader's pid to a static variable
> > before registering pa_shutdown() callback.
> >
>
> Why not simply move the registration of pa_shutdown() to someplace
> after logicalrep_worker_attach()?

If we do that, the worker won't call dsm_detach() if it raises an
ERROR in logicalrep_worker_attach(), is that okay? It seems that it's
no practically problem since we call dsm_backend_shutdown() in
shmem_exit(), but if so why do we need to call it in pa_shutdown()?

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com




Re: [DOC] Update ALTER SUBSCRIPTION documentation

2023-05-07 Thread Amit Kapila
On Fri, May 5, 2023 at 6:47 PM Robert Sjöblom  wrote:
>
> We have recently used the PostgreSQL documentation when setting up our
> logical replication. We noticed there was a step missing in the
> documentation on how to drop a logical replication subscription with a
> replication slot attached.
>
> We clarify the documentation to include prerequisites for running the
> DROP SUBSCRIPTION command. Please see attached patch.
>

Shouldn't we also change the following errhint in the code as well?
ReportSlotConnectionError()
{
...
ereport(ERROR,
(errcode(ERRCODE_CONNECTION_FAILURE),
errmsg("could not connect to publisher when attempting to drop
replication slot \"%s\": %s",
slotname, err),
/* translator: %s is an SQL ALTER command */
errhint("Use %s to disassociate the subscription from the slot.",
"ALTER SUBSCRIPTION ... SET (slot_name = NONE)")));
...
}

-- 
With Regards,
Amit Kapila.




Re: Proposal for Prototype Implementation to Enhance C/C++ Interoperability in PostgreSQL

2023-05-07 Thread 盏一
(Sorry, there was a problem with the format of the previous email content. I 
will send it in plain text format this time

> It seems extremely specific to one particular C++ implementation

To perform a force unwind during longjmp, the _Unwind_ForcedUnwind function is 
used. This function is defined in the [Itanium C++ ABI 
Standard](https://itanium-cxx-abi.github.io/cxx-abi/abi-eh.html#base-throw), 
which is followed by all C++ implementations. Additionally, the glibc 
[nptl/unwind.c](https://elixir.bootlin.com/glibc/latest/source/nptl/unwind.c#L130)
 file shows that on all platforms, pthread_exit is also implemented using 
_Unwind_ForcedUnwind.

Furthermore, the Itanium C++ ABI specification also defines 
_Unwind_RaiseException as the entry point for all C++ exceptions thrown.

> you've thrown in a new dependency on pthreads

The reason for the dependence on pthread is due to the overloading of 
_Unwind_RaiseException, which serves as the entry point for all C++ throwing 
exceptions. Some third-party C++ libraries may create threads internally and 
throw exceptions.

Overloading _Unwind_RaiseException is done to convert uncaught exceptions into 
elog(ERROR). If we require that all exceptions must be caught, we can remove 
the overloading of _Unwind_RaiseException and all pthread dependencies.

The overloading of _Unwind_RaiseException is just a fallback measure to prevent 
uncaught exceptions from terminating the process. In our code, this path is 
rarely taken, and once we encounter an exception that is not caught, we will 
fix the code to catch the exception.

> doesn't this require us to move our minimum language requirement to 
> C++-something?

No, all code has no dependency on C++.

regards, 盏一

Re: Add two missing tests in 035_standby_logical_decoding.pl

2023-05-07 Thread Amit Kapila
On Sat, May 6, 2023 at 9:33 PM Drouvot, Bertrand
 wrote:
>
> On 5/6/23 3:28 PM, Amit Kapila wrote:
> > On Sat, May 6, 2023 at 1:52 PM Drouvot, Bertrand
> >  wrote:
>
> > Next steps:
> > =
> > 1. The first thing is we should verify this theory by adding some LOG
> > in KeepLogSeg() to see if the _logSegNo is reduced due to the value
> > returned by XLogGetReplicationSlotMinimumLSN().
>
> Yeah, will do that early next week.
>
> > 2. The reason for the required file not being removed in the primary
> > is also that it has a physical slot which prevents the file removal.
>
> Yeah, agree. But this one is not an issue as we are not
> checking for the WAL file removal on the primary, do you agree?
>

Agreed.

> > 3. If the above theory is correct then I see a few possibilities to
> > fix this test (a) somehow ensure that restart_lsn of the physical slot
> > on standby is advanced up to the point where we can safely remove the
> > required files; (b) just create a separate test case by initializing a
> > fresh node for primary and standby where we only have logical slots on
> > standby. This will be a bit costly but probably less risky. (c) any
> > better ideas?
> >
>
> (c): Since, I think, the physical slot on the primary is not a concern for
> the reason mentioned above, then instead of (b):
>
> What about postponing the physical slot creation on the standby and the
> cascading standby node initialization after this test?
>

Yeah, that is also possible. But, I have a few questions regarding
that: (a) There doesn't seem to be a physical slot on cascading
standby, if I am missing something, can you please point me to the
relevant part of the test? (b) Which test is currently dependent on
the physical slot on standby?

-- 
With Regards,
Amit Kapila.




Re: Proposal for Prototype Implementation to Enhance C/C++ Interoperability in PostgreSQL

2023-05-07 Thread 盏一
 It seems extremely specific to one particular C++ implementation


To perform a force unwind during longjmp, the _Unwind_ForcedUnwind function is 
used. This function is defined in the [Itanium C++ ABI 
Standard](https://itanium-cxx-abi.github.io/cxx-abi/abi-eh.html#base-throw), 
which is followed by all C++ implementations. Additionally, the glibc 
[nptl/unwind.c](https://elixir.bootlin.com/glibc/latest/source/nptl/unwind.c#L130)
 file shows that on all platforms, pthread_exit is also implemented using 
_Unwind_ForcedUnwind.


Furthermore, the Itanium C++ ABI specification also defines 
_Unwind_RaiseException as the entry point for all C++ exceptions thrown.


 you've thrown in a new dependency on pthreads



The reason for the dependence on pthread is due to the overloading of 
_Unwind_RaiseException, which serves as the entry point for all C++ throwing 
exceptions. Some third-party C++ libraries may create threads internally and 
throw exceptions.


Overloading _Unwind_RaiseException is done to convert uncaught exceptions into 
elog(ERROR). If we require that all exceptions must be caught, we can remove 
the overloading of _Unwind_RaiseException and all pthread dependencies.


The overloading of _Unwind_RaiseException is just a fallback measure to prevent 
uncaught exceptions from terminating the process. In our code, this path is 
rarely taken, and once we encounter an exception that is not caught, we will 
fix the code to catch the exception.


 doesn't this require us to move our minimum language requirement to 
C++-something?



No, all code has no dependency on C++.

--Original--
From: "t...@sss.pgh.pa.us"https://github.com/postgres/postgres/commit/1a9a2790430f256d9d0cc371249e43769d93eb8e#diff-6b6034caa00ddf38f641cbd10d5a5d1bb7135f8b23c5a879e9703bd11bd8240f).
 I would appreciate it if you could review the implementation and provide 
feedback.

... but I think this patch has no hope of being adequately portable.
It seems extremely specific to one particular C++ implementation
(unless you can show that every single thing you've used here is
in the C++ standard), and then for good measure you've thrown in
a new dependency on pthreads. On top of that, doesn't this
require us to move our minimum language requirement to C++-something?
We just barely got done deciding C99 was okay to use.

 regards, tom lane

Re: 2023-05-11 release announcement draft

2023-05-07 Thread David Rowley
Thanks for working on this.

On Sun, 7 May 2023 at 15:37, Jonathan S. Katz  wrote:
> Please provide any suggestions, corrections, or notable omissions no
> later than 2023-05-11 0:00 AoE.

For this one:

> * Fix partition pruning logic for partitioning on boolean columns when using a
> `IS NOT TRUE` condition.

Just to explain this a little further:  Effectively the code thought
"IS NOT TRUE" meant "IS FALSE", and "IS NOT FALSE" meant "IS TRUE".
That was wrong because each of the NOT cases should have allowed
NULLs.

Maybe the wording can be adjusted to mention NULLs. Maybe something
along the lines of:

* Fix partition pruning bug with the boolean "IS NOT TRUE" and "IS NOT
FALSE" conditions. NULL partitions were accidentally pruned when they
shouldn't have been.

David




Re: Introduce join_info_array for direct lookups of SpecialJoinInfo by ojrelid

2023-05-07 Thread Richard Guo
On Thu, May 4, 2023 at 4:07 PM Richard Guo  wrote:

> When working on the improper qual pushdown issue [1], there is a need in
> the proposed fix to avoid scanning all the SpecialJoinInfos, since that
> is too expensive.  I think this might be a common requirement.  In the
> current codes there are several places where we need to scan all the
> SpecialJoinInfos in join_info_list looking for SpecialJoinInfos that
> belong to a given outer join relid set, which is an O(n) operation.  So
> start a new thread for this requirement.
>
> To improve the O(n) operation, introduce join_info_array to allow direct
> lookups of SpecialJoinInfo by ojrelid.  This is doable because for each
> non-zero ojrelid there can only be one SpecialJoinInfo.  This can
> benefit clause_is_computable_at() and have_unsafe_outer_join_ref(), as
> the patch does, and more future usages such as
> add_outer_joins_to_relids() in the proposed patch for issue [1].
>

BTW, I just noticed that the introduction of join_info_array can also
benefit make_outerjoininfo(), check_redundant_nullability_qual() and
get_join_domain_min_rels().  So update the patch to do the changes.

I'd like to devise a test query that shows performance gain from this
patch, but I'm not sure how to do that.  May need help here.

Any thoughts on this patch?

Thanks
Richard


v2-0001-Allow-direct-lookups-of-SpecialJoinInfo-by-ojrelid.patch
Description: Binary data


Re: Add LZ4 compression in pg_dump

2023-05-07 Thread Tomas Vondra



On 5/7/23 17:01, gkokola...@pm.me wrote:
> 
> 
> 
> On Sat, May 6, 2023 at 04:51, Michael Paquier   wrote:
>> On Fri, May 05, 2023 at 02:13:28PM +, gkokola...@pm.me wrote:
>> > Good point. I thought about it before submitting the patch. I
>> > concluded that given the complexity and operations involved in
>> > LZ4Stream_read_internal() and the rest of t he pg_dump/pg_restore
>> > code, the memset() call will be negligible. However from the
>> > readability point of view, the function is a bit cleaner with the
>> > memset().
>> >
>> > I will not object to any suggestion though, as this is a very
>> > trivial point. Please find attached a v2 of the patch following the
>> > suggested approach.
>>
>> Please note that an open item has been added for this stuff.
> Thank you but I am not certain I know what that means. Can you please
> explain?
> 

It means it was added to the list of items we need to fix before PG16
gets out:

https://wiki.postgresql.org/wiki/PostgreSQL_16_Open_Items


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Add LZ4 compression in pg_dump

2023-05-07 Thread Tom Lane
I wrote:
> Michael Paquier  writes:
>> While testing this patch, I have triggered an error pointing out that
>> the decompression path of LZ4 is broken for table data.  I can
>> reproduce that with a dump of the regression database, as of:
>> make installcheck
>> pg_dump --format=d --file=dump_lz4 --compress=lz4 regression

> Ugh.  Reproduced here ... so we need an open item for this.

BTW, it seems to work with --format=c.

regards, tom lane




Re: Add LZ4 compression in pg_dump

2023-05-07 Thread Michael Paquier
On Sun, May 07, 2023 at 09:09:25PM -0400, Tom Lane wrote:
> Ugh.  Reproduced here ... so we need an open item for this.

Yep.  Already added.
--
Michael


signature.asc
Description: PGP signature


Re: Add LZ4 compression in pg_dump

2023-05-07 Thread Tom Lane
Michael Paquier  writes:
> While testing this patch, I have triggered an error pointing out that
> the decompression path of LZ4 is broken for table data.  I can
> reproduce that with a dump of the regression database, as of:
> make installcheck
> pg_dump --format=d --file=dump_lz4 --compress=lz4 regression
> createdb regress_lz4
> pg_restore --format=d -d regress_lz4 dump_lz4
> pg_restore: error: COPY failed for table "clstr_tst": ERROR:  extra data 
> after last expected column
> CONTEXT:  COPY clstr_tst, line 15: "326   seis
> xyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzy..."
> pg_restore: warning: errors ignored on restore: 1

Ugh.  Reproduced here ... so we need an open item for this.

regards, tom lane




Re: 2023-05-11 release announcement draft

2023-05-07 Thread Jonathan S. Katz

On 5/7/23 1:09 AM, Erik Rijkers wrote:

Op 5/7/23 om 05:37 schreef Jonathan S. Katz:
Attached is a draft of the release announcement for the upcoming 
update release on May 11, 2023.


Please provide any suggestions, corrections, or notable omissions no 
later than 2023-05-11 0:00 AoE.


'leak in within a'  should be
'leak within a'


Thanks for that catch! Revision attached.

Jonathan

The PostgreSQL Global Development Group has released an update to all supported
versions of PostgreSQL, including 15.3, 14.8, 13.11, 12.15, and 11.20.
This release fixes over 80 bugs reported over the last several months.

For the full list of changes, please review the
[release notes](https://www.postgresql.org/docs/release/).

PostgreSQL 11 EOL Notice


PostgreSQL 11 will stop receiving fixes on November 9, 2023. If you are
running PostgreSQL 11 in a production environment, we suggest that you make
plans to upgrade to a newer, supported version of PostgreSQL. Please see our
[versioning policy](https://www.postgresql.org/support/versioning/) for more
information.

Bug Fixes and Improvements
--
 
This update fixes over 80 bugs that were reported in the last several months.
The issues listed below affect PostgreSQL 15. Some of these issues may also
affect other supported versions of PostgreSQL.

Included in this release:

* Several fixes for [`CREATE 
DATABASE`](https://www.postgresql.org/docs/current/sql-createdatabase.html)
when using the `STRATEGY = WAL_LOG`, including a potential corruption that could
lose modifications to a template/source database.
* Fix crash with [`CREATE SCHEMA 
AUTHORIZATION`](https://www.postgresql.org/docs/current/sql-createschema.html).
* Several fixes for 
[`MERGE`](https://www.postgresql.org/docs/current/sql-merge.html).
* Several fixes for triggers in partitioned tables.
* Disallow altering composite types that are stored in indexes.
* Ensure that [`COPY TO`](https://www.postgresql.org/docs/current/sql-copy.html)
from a parent table with [row-level 
security](https://www.postgresql.org/docs/current/ddl-rowsecurity.html)
enabled does not copy any rows from child tables.
* Adjust text-search-related character classification logic to correctly detect
whether the prevailing locale is C when the default collation of a database uses
the ICU provider.
* Re-allow exponential notation in ISO-8601 interval fields.
* Improve error reporting for various invalid JSON string literals.
* Fix data corruption due to `vacuum_defer_cleanup_age` being larger than the
current 64-bit xid.
* Several fixes for the query parser and planner, including better detection of
improperly-nested aggregates.
* Fix partition pruning logic for partitioning on boolean columns when using a
`IS NOT TRUE` condition.
* Fix memory leak in [Memoize 
plan](https://www.postgresql.org/docs/current/runtime-config-query.html#GUC-ENABLE-MEMOIZE)
execution.
* Fix buffer refcount leak on foreign tables using partitions when performing
batched inserts.
* Restore support for sub-millisecond `vacuum_cost_delay` settings.
* Several fixes for
[views and rules](https://www.postgresql.org/docs/current/rules-views.html).
* Avoid unnecessary work while scanning a multi-column
[BRIN index](https://www.postgresql.org/docs/current/brin.html) with multiple
scan keys.
* Ignore dropped columns and generated columns during logical replication of an
`UPDATE` or `DELETE` action.
* Several fixes for naming and availability of wait events.
* Support RSA-PSS certificates with
[SCRAM-SHA-256](https://www.postgresql.org/docs/current/sasl-authentication.html#SASL-SCRAM-SHA-256)
channel binding. This feature requires building with OpenSSL 1.1.1 or newer.
* Avoid race condition with process ID tracking on Windows.
* Fix memory leak within a session for 
[PL/pgSQL](https://www.postgresql.org/docs/current/plpgsql.html)
[`DO`](https://www.postgresql.org/docs/current/sql-do.html) blocks that use cast
expressions.
* Tighten array dimensionality checks from
[PL/Perl](https://www.postgresql.org/docs/current/plperl.html) and
[PL/Python](https://www.postgresql.org/docs/current/plpython.html) when
converting list structures to multi-dimensional SQL arrays.
* Fix [`pg_dump`](https://www.postgresql.org/docs/current/app-pgdump.html) so
that partitioned tables that are hash-partitioned on an
[enumerated type](https://www.postgresql.org/docs/current/datatype-enum.html)
column can be restored successfully.
* Fix for [`pg_trgm`](https://www.postgresql.org/docs/current/pgtrgm.html) where
an unsatisfiable regular expression could lead to a crash when using a GiST or
GIN index.
* Limit memory usage of `pg_get_wal_records_info()` in
[`pg_walinspect`](https://www.postgresql.org/docs/current/pgwalinspect.html).

This release also updates time zone data files to tzdata release 2023c for DST
law changes in Egypt, Greenland, Morocco, and Palestine. When observing Moscow
time, Europe/Kirov and Europe/Volgograd now use the abbreviations MSK/MSD

Re: Add LZ4 compression in pg_dump

2023-05-07 Thread Michael Paquier
On Fri, May 05, 2023 at 02:13:28PM +, gkokola...@pm.me wrote:
> Good point. I thought about it before submitting the patch. I
> concluded that given the complexity and operations involved in
> LZ4Stream_read_internal() and the rest of t he pg_dump/pg_restore
> code, the memset() call will be negligible. However from the
> readability point of view, the function is a bit cleaner with the
> memset(). 
> 
> I will not object to any suggestion though, as this is a very
> trivial point. Please find attached a v2 of the patch following the
> suggested approach.

Hmm.  I was looking at this patch, and what you are trying to do
sounds rather right to keep a parallel with the gzip and zstd code
paths.

Looking at the code of gzread.c, gzgets() enforces a null-termination
on the string read.  Still, isn't that something we'd better enforce
in read_none() as well?  compress_io.h lists this as a requirement of
the callback, and Zstd_gets() does so already.  read_none() does not
enforce that, unfortunately.

+   /* No work needs to be done for a zero-sized output buffer */
+   if (size <= 0)
+   return 0;

Indeed.  This should be OK.

-   ret = LZ4Stream_read_internal(state, ptr, size, true);
+   Assert(size > 1);

The addition of this assertion is a bit surprising, and this is
inconsistent with Zstd_gets where a length of 1 is authorized.  We
should be more consistent across all the callbacks, IMO, not less, so
as we apply the same API contract across all the compression methods.

While testing this patch, I have triggered an error pointing out that
the decompression path of LZ4 is broken for table data.  I can
reproduce that with a dump of the regression database, as of:
make installcheck
pg_dump --format=d --file=dump_lz4 --compress=lz4 regression
createdb regress_lz4
pg_restore --format=d -d regress_lz4 dump_lz4
pg_restore: error: COPY failed for table "clstr_tst": ERROR:  extra data after 
last expected column
CONTEXT:  COPY clstr_tst, line 15: "32  6   seis
xyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzy..."
pg_restore: warning: errors ignored on restore: 1

This does not show up with gzip or zstd, and the patch does not
influence the result.  In short it shows up with and without the
patch, on HEAD.  That does not look really stable :/
--
Michael


signature.asc
Description: PGP signature


Re: PGDOCS - Replica Identity quotes

2023-05-07 Thread Peter Smith
On Sat, May 6, 2023 at 5:28 AM David Zhang  wrote:
>
> On 2023-03-16 4:46 p.m., Peter Smith wrote:
> > A rebase was needed due to the recent REPLICA IDENTITY push [1].
> >
> > PSA v2.
> >
> > 
> > -   A published table must have a replica identity 
> > configured in
> > +   A published table must have a replica identity 
> > configured in
> +1
> >  order to be able to replicate UPDATE
> >  and DELETE operations, so that appropriate rows to
> >  update or delete can be identified on the subscriber side.  By default,
> >  this is the primary key, if there is one.  Another unique index (with
> >  certain additional requirements) can also be set to be the replica
> >  identity.  If the table does not have any suitable key, then it can be 
> > set
> > -   to replica identity full, which means the entire row 
> > becomes
> > -   the key.  When replica identity full is specified,
> > +   to REPLICA IDENTITY FULL, which means the entire row 
> > becomes
> > +   the key.  When REPLICA IDENTITY FULL is specified,
> >  indexes can be used on the subscriber side for searching the rows.  
> > Candidate
> >  indexes must be btree, non-partial, and have at least one column 
> > reference
> >  (i.e. cannot consist of only expressions).  These restrictions
> >  on the non-unique index properties adhere to some of the restrictions 
> > that
> >  are enforced for primary keys.  If there are no such suitable indexes,
> >  the search on the subscriber side can be very inefficient, therefore
> > -   replica identity full should only be used as a
> > +   REPLICA IDENTITY FULL should only be used as a
> >  fallback if no other solution is possible.  If a replica identity other
> IMO, it would be better just change "full" to "FULL". On one side, it
> can emphasize that "FULL" is one of the specific values (DEFAULT | USING
> INDEX index_name | FULL | NOTHING); on the other side, it leaves
> "replica identity" in lowercase to be more consistent with the
> terminology used in this entire paragraph.
> > -   than full is set on the publisher side, a replica 
> > identity
> > +   than FULL is set on the publisher side, a replica 
> > identity
> +1
> >  comprising the same or fewer columns must also be set on the subscriber
> >  side.  See  for 
> > details on
> >  how to set the replica identity.  If a table without a replica 
> > identity is
>

Thanks for giving some feedback on my patch.

PSA v3 which is changed per your suggestion.

--
Kind Regards,
Peter Smith.
Fujitsu Australia


v3-0001-PGDOCS-replica-identity-quotes.patch
Description: Binary data


Re: Add LZ4 compression in pg_dump

2023-05-07 Thread Michael Paquier
On Sun, May 07, 2023 at 03:01:52PM +, gkokola...@pm.me wrote:
> Thank you but I am not certain I know what that means. Can you please explain?

It means that this thread has been added to the following list:
https://wiki.postgresql.org/wiki/PostgreSQL_16_Open_Items#Open_Issues

pg_dump/compress_lz4.c is new as of PostgreSQL 16, and this patch is
fixing a deficiency.  That's just a way outside of the commit fest to
track any problems and make sure these are fixed before the release
happens.
--
Michael


signature.asc
Description: PGP signature


Re: psql: Add role's membership options to the \du+ command

2023-05-07 Thread Pavel Luzanov

On 05.05.2023 19:51, David G. Johnston wrote:
But if it is really a blocker then maybe we should produce 3 separate 
newline separated columns, one for the member of role, one for the 
list of attributes, and one with the grantor.  The column headers can 
be translated more easily as single nouns.  The readability quite 
probably would end up being equivalent (maybe even better) in tabular 
form instead of sentence form.


Just to visualize this approach. Below are the output for the tabular 
form and the sentence form from last patch version (sql script attached):


Tabular form rolname  | memberof |   options   
| grantor 
--+--+-+-- postgres 
|  | |  regress_du_admin | 
regress_du_role0+| admin, inherit, set+| postgres    
+  | regress_du_role1+| admin, inherit, set+| 
postgres    +  | regress_du_role2 | admin, inherit, 
set | postgres regress_du_role0 |  | 
|  regress_du_role1 | regress_du_role0+| admin, inherit, set+| 
regress_du_admin+  | regress_du_role0+| 
inherit    +| regress_du_role1+  | 
regress_du_role0 | set | 
regress_du_role2 regress_du_role2 | regress_du_role0+| 
admin  +| regress_du_admin+  | 
regress_du_role0+| inherit, set   +| 
regress_du_role1+  | regress_du_role0+| 
empty  +| regress_du_role2+  | 
regress_du_role1 | admin, set  | regress_du_admin(5 
rows)Sentence form from patch v7 rolname  
|   memberof 
--+-- postgres 
|  regress_du_admin | regress_du_role0 from postgres (admin, inherit, 
set)    +  | regress_du_role1 from postgres (admin, 
inherit, set)    +  | regress_du_role2 from postgres 
(admin, inherit, set) regress_du_role0 |  regress_du_role1 | 
regress_du_role0 from regress_du_admin (admin, inherit, 
set)+  | regress_du_role0 from regress_du_role1 
(inherit)    +  | regress_du_role0 from 
regress_du_role2 (set) regress_du_role2 | regress_du_role0 from 
regress_du_admin (admin)  +  | 
regress_du_role0 from regress_du_role1 (inherit, set)   
+  | regress_du_role0 from regress_du_role2 
(empty)  +  | regress_du_role1 from 
regress_du_admin (admin, set)(5 rows)  


The tabular form solves the latest patch translation problems mentioned by 
Kyotaro.
But it requires mapping elements between 3 array-like columns.

To move forward, needs more opinions?

 
-

Pavel Luzanov


t.sql
Description: application/sql


Re: MERGE lacks ruleutils.c decompiling support!?

2023-05-07 Thread Alvaro Herrera
On 2023-May-07, Tom Lane wrote:

> I wrote:
> > Ping?  The hours grow short, if we're to get this into 15.3.
> 
> I went ahead and pushed v2, since we can't wait any longer if
> we're to get reasonable buildfarm coverage before 15.3 wraps.

Much appreciated.  I wanted to get back to this yesterday but was unable
to.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/




Re: Proposal for Prototype Implementation to Enhance C/C++ Interoperability in PostgreSQL

2023-05-07 Thread Tom Lane
"=?utf-8?B?55uP5LiA?="  writes:
> The proposed implementation can significantly improve the interoperability 
> between C and C++ code in PostgreSQL. It allows for seamless integration of 
> C++ code with PostgreSQL, without the need for complex workarounds or 
> modifications to the existing codebase.

That'd be nice to have, certainly ...

> I have submitted the implementation 
> on[GitHub](https://github.com/postgres/postgres/commit/1a9a2790430f256d9d0cc371249e43769d93eb8e#diff-6b6034caa00ddf38f641cbd10d5a5d1bb7135f8b23c5a879e9703bd11bd8240f).
>  I would appreciate it if you could review the implementation and provide 
> feedback.

... but I think this patch has no hope of being adequately portable.
It seems extremely specific to one particular C++ implementation
(unless you can show that every single thing you've used here is
in the C++ standard), and then for good measure you've thrown in
a new dependency on pthreads.  On top of that, doesn't this
require us to move our minimum language requirement to C++-something?
We just barely got done deciding C99 was okay to use.

regards, tom lane




Re: MERGE lacks ruleutils.c decompiling support!?

2023-05-07 Thread Tom Lane
I wrote:
> Ping?  The hours grow short, if we're to get this into 15.3.

I went ahead and pushed v2, since we can't wait any longer if
we're to get reasonable buildfarm coverage before 15.3 wraps.

regards, tom lane




Re: Add LZ4 compression in pg_dump

2023-05-07 Thread gkokolatos
On Sat, May 6, 2023 at 04:51, Michael Paquier <[mich...@paquier.xyz](mailto:On 
Sat, May 6, 2023 at 04:51, Michael Paquier < wrote:

> On Fri, May 05, 2023 at 02:13:28PM +, gkokola...@pm.me wrote:
>> Good point. I thought about it before submitting the patch. I
>> concluded that given the complexity and operations involved in
>> LZ4Stream_read_internal() and the rest of t he pg_dump/pg_restore
>> code, the memset() call will be negligible. However from the
>> readability point of view, the function is a bit cleaner with the
>> memset().
>>
>> I will not object to any suggestion though, as this is a very
>> trivial point. Please find attached a v2 of the patch following the
>> suggested approach.
>
> Please note that an open item has been added for this stuff.

Thank you but I am not certain I know what that means. Can you please explain?

Cheers,
//Georgios

> --
> Michael

Re: Missing update of all_hasnulls in BRIN opclasses

2023-05-07 Thread Tomas Vondra



On 5/7/23 07:08, Julien Rouhaud wrote:
> Hi,
> 
> On Sun, May 07, 2023 at 12:13:07AM +0200, Tomas Vondra wrote:
>>
>> c) ignore the issue - AFAICS this would be an issue only for (external)
>> code accessing BrinMemTuple structs, but I don't think we're aware of
>> any out-of-core BRIN opclasses or anything like that ...
> 
> FTR there's at least postgis that implments BRIN opclasses (for geometries and
> geographies), but there's nothing fancy in the implementation and it doesn't
> access BrinMemTuple struct.

Right. I believe that should be fine, because opclasses don't access the
tuple directly - instead we pass pointers to individual pieces. But
maybe it'd be a good idea to test this.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [PATCH] Allow Postgres to pick an unused port to listen

2023-05-07 Thread Yurii Rashkovskii
Hi Cary,

Thank you so much for the review. It's very valuable, and you caught an
important issue with it that I somehow missed (not updating the .pid file
with the selected port number). I'm not sure how it escaped me (perhaps I
was focusing too much on the log file to validate the behaviour).

I've amended the patch to ensure the port number is in the lock file. I've
attached V2.

Yurii


On Sat, May 6, 2023 at 12:31 AM Cary Huang  wrote:

> The following review has been posted through the commitfest application:
> make installcheck-world:  tested, passed
> Implements feature:   tested, failed
> Spec compliant:   not tested
> Documentation:not tested
>
> Hello
>
> This is one of those features that is beneficial to a handful of people in
> specific test cases. It may not benefit the majority of the users but is
> certainly not useless either. As long as it can be disabled and enough
> tests have been run to ensure it won't have a significant impact on working
> components while disabled, it should be fine in my opinion. Regarding where
> the selected port shall be saved (postmaster.pid, read by pg_ctl or saved
> in a dedicated file), I see that postmaster.pid already contains a port
> number in line number 4, so adding a port number into there is nothing new;
> port number is already there and we can simply replace the port number with
> the one selected by the system.
>
> I applied and tested the patch and found that the system can indeed start
> when port is set to 0, but line 4 of postmaster.pid does not store the port
> number selected by the system, rather, it stored 0, which is the same as
> configured. So I am actually not able to find out the port number that my
> PG is running on, at least not in a straight-forward way.
>
> thank you
> ==
> Cary Huang
> HighGo Software
> www.highgo.ca



-- 
Y.


V2-0001-Allow-listening-port-to-be-0.patch
Description: Binary data


Proposal for Prototype Implementation to Enhance C/C++ Interoperability in PostgreSQL

2023-05-07 Thread 盏一
Hi




I am writing to propose a prototype implementation that can greatly enhance the 
C/C++ interoperability in PostgreSQL. The implementation involves converting PG 
longjmp to[force 
unwind](https://itanium-cxx-abi.github.io/cxx-abi/abi-eh.html#base-throw), 
which triggers the destruction of local variables on the stack. Additionally, 
it converts throw statements that are not associated with catch to PG longjmp, 
thereby avoiding the call to terminate.




The proposed implementation can significantly improve the interoperability 
between C and C++ code in PostgreSQL. It allows for seamless integration of C++ 
code with PostgreSQL, without the need for complex workarounds or modifications 
to the existing codebase.




I have submitted the implementation 
on[GitHub](https://github.com/postgres/postgres/commit/1a9a2790430f256d9d0cc371249e43769d93eb8e#diff-6b6034caa00ddf38f641cbd10d5a5d1bb7135f8b23c5a879e9703bd11bd8240f).
 I would appreciate it if you could review the implementation and provide 
feedback.




Thank you for your time and consideration.




Best regards,

盏一
--