date:20251013

pgsql: Eliminate XLOG_HEAP2_VISIBLE from vacuum phase III

2025-10-13 Thread Melanie Plageman

Eliminate XLOG_HEAP2_VISIBLE from vacuum phase III

Instead of emitting a separate XLOG_HEAP2_VISIBLE WAL record for each
page that becomes all-visible in vacuum's third phase, specify the VM
changes in the already emitted XLOG_HEAP2_PRUNE_VACUUM_CLEANUP record.

Visibility checks are now performed before marking dead items unused.
This is safe because the heap page is held under exclusive lock for the
entire operation.

This reduces the number of WAL records generated by VACUUM phase III by
up to 50%.

Author: Melanie Plageman 
Reviewed-by: Andres Freund 
Reviewed-by: Robert Haas 
Reviewed-by: Kirill Reshke 
Discussion: 
https://postgr.es/m/flat/CAAKRu_ZMw6Npd_qm2KM%2BFwQ3cMOMx1Dh3VMhp8-V7SOLxdK9-g%40mail.gmail.com

Branch
--
master

Details
---
https://git.postgresql.org/pg/commitdiff/add323da40a6bf9e01cdda510e32ea924c89cd1a

Modified Files
--
src/backend/access/heap/heapam_xlog.c  | 128 +++--
src/backend/access/heap/pruneheap.c|  55 ++-
src/backend/access/heap/vacuumlazy.c   | 166 -
src/backend/access/rmgrdesc/heapdesc.c |  11 ++-
src/include/access/heapam.h|   1 +
src/include/access/heapam_xlog.h   |  17 +++-
6 files changed, 302 insertions(+), 76 deletions(-)

pgsql: Fix incorrect message-printing in win32security.c.

2025-10-13 Thread Tom Lane

Fix incorrect message-printing in win32security.c.

log_error() would probably fail completely if used, and would
certainly print garbage for anything that needed to be interpolated
into the message, because it was failing to use the correct printing
subroutine for a va_list argument.

This bug likely went undetected because the error cases this code
is used for are rarely exercised - they only occur when Windows
security API calls fail catastrophically (out of memory, security
subsystem corruption, etc).

The FRONTEND variant can be fixed just by calling vfprintf()
instead of fprintf().  However, there was no va_list variant
of write_stderr(), so create one by refactoring that function.
Following the usual naming convention for such things, call
it vwrite_stderr().

Author: Bryan Green 
Reviewed-by: Tom Lane 
Discussion: 
https://postgr.es/m/CAF+pBj8goe4fRmZ0V3Cs6eyWzYLvK+HvFLYEYWG=tzam+tw...@mail.gmail.com
Backpatch-through: 13

Branch
--
REL_13_STABLE

Details
---
https://git.postgresql.org/pg/commitdiff/75a555d61b3b407c904f41c6a5cb922675f237e2

Modified Files
--
src/backend/utils/error/elog.c | 16 +---
src/include/utils/elog.h   |  1 +
src/port/win32security.c   |  4 ++--
3 files changed, 16 insertions(+), 5 deletions(-)

pgsql: Fix incorrect message-printing in win32security.c.

2025-10-13 Thread Tom Lane

Fix incorrect message-printing in win32security.c.

log_error() would probably fail completely if used, and would
certainly print garbage for anything that needed to be interpolated
into the message, because it was failing to use the correct printing
subroutine for a va_list argument.

This bug likely went undetected because the error cases this code
is used for are rarely exercised - they only occur when Windows
security API calls fail catastrophically (out of memory, security
subsystem corruption, etc).

The FRONTEND variant can be fixed just by calling vfprintf()
instead of fprintf().  However, there was no va_list variant
of write_stderr(), so create one by refactoring that function.
Following the usual naming convention for such things, call
it vwrite_stderr().

Author: Bryan Green 
Reviewed-by: Tom Lane 
Discussion: 
https://postgr.es/m/CAF+pBj8goe4fRmZ0V3Cs6eyWzYLvK+HvFLYEYWG=tzam+tw...@mail.gmail.com
Backpatch-through: 13

Branch
--
master

Details
---
https://git.postgresql.org/pg/commitdiff/03bf7a12c5a44ced377352c8f9bf6e9f4b863885

Modified Files
--
src/backend/utils/error/elog.c | 16 +---
src/include/utils/elog.h   |  1 +
src/port/win32security.c   |  4 ++--
3 files changed, 16 insertions(+), 5 deletions(-)

pgsql: Remove extra semicolon in example

2025-10-13 Thread Magnus Hagander

Remove extra semicolon in example

Reported-By: Pavel Luzanov 
Discussion: 
https://postgr.es/m/[email protected]
Backpatch-through: 18

Branch
--
REL_18_STABLE

Details
---
https://git.postgresql.org/pg/commitdiff/656736402f54f840a90acfae7db826c53d0a8b1d

Modified Files
--
doc/src/sgml/func.sgml | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

pgsql: Remove extra semicolon in example

2025-10-13 Thread Magnus Hagander

Remove extra semicolon in example

Reported-By: Pavel Luzanov 
Discussion: 
https://postgr.es/m/[email protected]
Backpatch-through: 18

Branch
--
master

Details
---
https://git.postgresql.org/pg/commitdiff/e062af861b453769ccf3940fe7920a87d7d31fe6

Modified Files
--
doc/src/sgml/func/func-json.sgml | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

pgsql: Remove state.tmp when failing to save a replication slot

2025-10-13 Thread Michael Paquier

Remove state.tmp when failing to save a replication slot

An error happening while a slot data is saved on disk in
SaveSlotToPath() could cause a state.tmp file (temporary file holding
the slot state data, renamed to its permanent name at the end of the
function) to remain around after it has been created.  This temporary
file is created with O_EXCL, meaning that if an existing state.tmp is
found, its creation would fail.  This would prevent the slot data to be
saved, requiring a manual intervention to remove state.tmp before being
able to save again a slot.  Possible scenarios where this temporary file
could remain on disk is for example a ENOSPC case (no disk space) while
writing, syncing or renaming it.  The bug reports point to a write
failure as the principal cause of the problems.

Using O_TRUNC has been argued back in 2019 as a potential solution to
discard any temporary file that could exist.  This solution was rejected
as O_EXCL can also act as a safety measure when saving the slot state,
crash recovery offering cleanup guarantees post-crash.  This commit uses
the alternative approach that has been suggested by Andres Freund back
in 2019.  When the temporary state file cannot be written, synced,
closed or renamed (note: not when created!), an unlink() is used to
remove the temporary state file while holding the in-progress I/O
LWLock, so as any follow-up attempts to save a slot's data would not
choke on an existing file that remained around because of a previous
failure.

This problem has been reported a few times across the years, going back
to 2019, but for some reason I have never come back to do something
about it and it has been forgotten.  A recent report has reminded me
that this was still a problem.

Reported-by: Kevin K Biju 
Reported-by: Sergei Kornilov 
Reported-by: Grigory Smolkin 
Discussion: 
https://postgr.es/m/cam45keha32sokl_g8vk38cwvtbeooxcsxapas7jt7yprf2m...@mail.gmail.com
Discussion: 
https://postgr.es/m/[email protected]
Discussion: 
https://postgr.es/m/[email protected]
Backpatch-through: 13

Branch
--
REL_15_STABLE

Details
---
https://git.postgresql.org/pg/commitdiff/0adf424b490b701ae8eeeca3d9630773f18cd937

Modified Files
--
src/backend/replication/slot.c | 7 +++
1 file changed, 7 insertions(+)

pgsql: Fix serious performance problems in LZ4Stream_read_internal.

2025-10-13 Thread Tom Lane

Fix serious performance problems in LZ4Stream_read_internal.

I was distressed to find that reading an LZ4-compressed toc.dat
file was hundreds of times slower than it ought to be.  On
investigation, the blame mostly affixes to LZ4Stream_read_overflow's
habit of memmove'ing all the remaining buffered data after each read
operation.  Since reading a TOC file tends to involve a lot of small
(even one-byte) decompression calls, that amounts to an O(N^2) cost.

This could have been fixed with a minimal patch, but to my
eyes LZ4Stream_read_internal and LZ4Stream_read_overflow are
badly-written spaghetti code; in particular the eol_flag logic
is inefficient and duplicative.  I chose to throw the code
away and rewrite from scratch.  This version is about sixty
lines shorter as well as not having the performance issue.

Fortunately, AFAICT the only way to get to this problem is to
manually LZ4-compress the toc.dat and/or blobs.toc files within a
directory-style archive; in the main data files, we read blocks
that are large enough that the O(N^2) behavior doesn't manifest.
Few people do that, which likely explains the lack of field
complaints.  Otherwise this performance bug might be considered
bad enough to warrant back-patching.

Author: Tom Lane 
Reviewed-by: Chao Li 
Discussion: https://postgr.es/m/[email protected]

Branch
--
master

Details
---
https://git.postgresql.org/pg/commitdiff/1f8062dd9668572d66549fc798a7d2057aa34ee1

Modified Files
--
src/bin/pg_dump/compress_lz4.c | 242 +++--
1 file changed, 89 insertions(+), 153 deletions(-)

pgsql: Fix version number calculation for data folder flush in pg_combi

2025-10-13 Thread Michael Paquier

Fix version number calculation for data folder flush in pg_combinebackup

The version number calculated by read_pg_version_file() is multiplied
once by 1, to be able to do comparisons based on PG_VERSION_NUM or
equivalents with a minor version included.  However, the version number
given sync_pgdata() was multiplied by 1 a second time, leading to an
overestimated number.

This issue was harmless (still incorrect) as pg_combinebackup does not
support versions of Postgres older than v10, and sync_pgdata() only
includes a version check due to the rename of pg_xlog/ to pg_wal/.  This
folder rename happened in the development cycle of v10.  This would
become a problem if in the future  sync_pgdata() is changed to have more
version-specific checks.

Oversight in dc212340058b, so backpatch down to v17.

Reviewed-by: Chao Li 
Discussion: https://postgr.es/m/[email protected]
Backpatch-through: 17

Branch
--
master

Details
---
https://git.postgresql.org/pg/commitdiff/1c05fe11abb6a1588b158e3f39b668053e24cdae

Modified Files
--
src/bin/pg_combinebackup/pg_combinebackup.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

pgsql: Fix incorrect message-printing in win32security.c.

2025-10-13 Thread Tom Lane

Fix incorrect message-printing in win32security.c.

log_error() would probably fail completely if used, and would
certainly print garbage for anything that needed to be interpolated
into the message, because it was failing to use the correct printing
subroutine for a va_list argument.

This bug likely went undetected because the error cases this code
is used for are rarely exercised - they only occur when Windows
security API calls fail catastrophically (out of memory, security
subsystem corruption, etc).

The FRONTEND variant can be fixed just by calling vfprintf()
instead of fprintf().  However, there was no va_list variant
of write_stderr(), so create one by refactoring that function.
Following the usual naming convention for such things, call
it vwrite_stderr().

Author: Bryan Green 
Reviewed-by: Tom Lane 
Discussion: 
https://postgr.es/m/CAF+pBj8goe4fRmZ0V3Cs6eyWzYLvK+HvFLYEYWG=tzam+tw...@mail.gmail.com
Backpatch-through: 13

Branch
--
REL_16_STABLE

Details
---
https://git.postgresql.org/pg/commitdiff/9883e3cd1c61b1bf23d655e62789ff56c3da3bb5

Modified Files
--
src/backend/utils/error/elog.c | 16 +---
src/include/utils/elog.h   |  1 +
src/port/win32security.c   |  4 ++--
3 files changed, 16 insertions(+), 5 deletions(-)

pgsql: Doc: clarify n_distinct_inherited setting

2025-10-13 Thread David Rowley

Doc: clarify n_distinct_inherited setting

There was some confusion around how to adjust the n_distinct estimates
for partitioned tables.  Here we try and clarify that
n_distinct_inherited needs to be adjusted rather than n_distinct.

Also fix some slightly misleading text which was talking about table
size rather than table rows, fix a grammatical error, and adjust some
text which indicated that ANALYZE was performing calculations based on
the n_distinct settings.  Really it's the query planner that does this
and ANALYZE only stores the overridden n_distinct estimate value in
pg_statistic.

Author: David Rowley 
Reviewed-by: David G. Johnston 
Reviewed-by: Chao Li 
Backpatch-through: 13
Discussion: 
https://postgr.es/m/CAApHDvrL7a-ZytM1SP8Uk9nEw9bR2CPzVb+uP+bcNj=_q-z...@mail.gmail.com

Branch
--
REL_16_STABLE

Details
---
https://git.postgresql.org/pg/commitdiff/0a47f054ef8d5b9f2d9702ba1a558953fc7c

Modified Files
--
doc/src/sgml/ref/alter_table.sgml | 34 --
1 file changed, 16 insertions(+), 18 deletions(-)

pgsql: Fix issue with reading zero bytes in Gzip_read.

2025-10-13 Thread Tom Lane

Fix issue with reading zero bytes in Gzip_read.

pg_dump expects a read request of zero bytes to be a no-op; see for
example ReadStr().  Gzip_read got this wrong and falsely supposed
that the resulting gzret == 0 indicated an error.  We could complicate
that error-checking logic some more, but it seems best to just fall
out immediately when passed size == 0.

This bug breaks the nominally-supported case of manually gzip'ing
the toc.dat file within a directory-style dump, so back-patch to v16
where this code came in.  (Prior branches already have a short-circuit
for size == 0 before their only gzread call.)

Author: Tom Lane 
Reviewed-by: Chao Li 
Discussion: https://postgr.es/m/[email protected]
Backpatch-through: 16

Branch
--
REL_16_STABLE

Details
---
https://git.postgresql.org/pg/commitdiff/1518b7d76aadcbdffa6214555b82b995e7404b38

Modified Files
--
src/bin/pg_dump/compress_gzip.c | 4 
1 file changed, 4 insertions(+)

pgsql: Fix poor buffering logic in pg_dump's lz4 and zstd compression c

2025-10-13 Thread Tom Lane

Fix poor buffering logic in pg_dump's lz4 and zstd compression code.

Both of these modules dumped each bit of output that they got from
the underlying compression library as a separate "data block" in
the emitted archive file.  In the case of zstd this'd frequently
result in block sizes well under 100 bytes; lz4 is a little better
but still produces blocks around 300 bytes, at least in the test
case I tried.  This bloats the archive file a little bit compared
to larger block sizes, but the real problem is that when pg_restore
has to skip each data block rather than seeking directly to some
target data, tiny block sizes are enormously inefficient.

Fix both modules so that they fill their allocated buffer reasonably
well before dumping a data block.  In the case of lz4, also delete
some redundant logic that caused the lz4 frame header to be emitted
as a separate data block.  (That saves little, but I see no reason
to expend extra code to get worse results.)

I fixed the "stream API" code too.  In those cases, feeding small
amounts of data to fwrite() probably doesn't have any meaningful
performance consequences.  But it seems like a bad idea to leave
the two sets of code doing the same thing in two different ways.

In passing, remove unnecessary "extra paranoia" check in
_ZstdWriteCommon.  _CustomWriteFunc (the only possible referent
of cs->writeF) already protects itself against zero-length writes,
and it's really a modularity violation for _ZstdWriteCommon to know
that the custom format disallows empty data blocks.  Also, fix
Zstd_read_internal to do less work when passed size == 0.

Reported-by: Dimitrios Apostolou 
Author: Tom Lane 
Reviewed-by: Chao Li 
Discussion: https://postgr.es/m/[email protected]

Branch
--
master

Details
---
https://git.postgresql.org/pg/commitdiff/fe8192a95e6c7159d639e341740e32966c9cf385

Modified Files
--
src/bin/pg_dump/compress_lz4.c  | 167 +++-
src/bin/pg_dump/compress_zstd.c |  42 +-
2 files changed, 118 insertions(+), 91 deletions(-)

pgsql: Doc: clarify n_distinct_inherited setting

2025-10-13 Thread David Rowley

Doc: clarify n_distinct_inherited setting

There was some confusion around how to adjust the n_distinct estimates
for partitioned tables.  Here we try and clarify that
n_distinct_inherited needs to be adjusted rather than n_distinct.

Also fix some slightly misleading text which was talking about table
size rather than table rows, fix a grammatical error, and adjust some
text which indicated that ANALYZE was performing calculations based on
the n_distinct settings.  Really it's the query planner that does this
and ANALYZE only stores the overridden n_distinct estimate value in
pg_statistic.

Author: David Rowley 
Reviewed-by: David G. Johnston 
Reviewed-by: Chao Li 
Backpatch-through: 13
Discussion: 
https://postgr.es/m/CAApHDvrL7a-ZytM1SP8Uk9nEw9bR2CPzVb+uP+bcNj=_q-z...@mail.gmail.com

Branch
--
master

Details
---
https://git.postgresql.org/pg/commitdiff/615a0fc2f1fd30df863b0dde2c045eaab8018ec6

Modified Files
--
doc/src/sgml/ref/alter_table.sgml | 34 --
1 file changed, 16 insertions(+), 18 deletions(-)

pgsql: Eliminate XLOG_HEAP2_VISIBLE from vacuum phase III

pgsql: Fix incorrect message-printing in win32security.c.

pgsql: Fix incorrect message-printing in win32security.c.

pgsql: Remove extra semicolon in example

pgsql: Remove extra semicolon in example

pgsql: Remove state.tmp when failing to save a replication slot

pgsql: Fix serious performance problems in LZ4Stream_read_internal.

pgsql: Fix version number calculation for data folder flush in pg_combi

pgsql: Fix incorrect message-printing in win32security.c.

pgsql: Doc: clarify n_distinct_inherited setting

pgsql: Fix issue with reading zero bytes in Gzip_read.

pgsql: Fix poor buffering logic in pg_dump's lz4 and zstd compression c

pgsql: Doc: clarify n_distinct_inherited setting

13 matches

Site Navigation

Mail list logo

Footer information