pgsql: Eliminate XLOG_HEAP2_VISIBLE from vacuum phase III
Eliminate XLOG_HEAP2_VISIBLE from vacuum phase III Instead of emitting a separate XLOG_HEAP2_VISIBLE WAL record for each page that becomes all-visible in vacuum's third phase, specify the VM changes in the already emitted XLOG_HEAP2_PRUNE_VACUUM_CLEANUP record. Visibility checks are now performed before marking dead items unused. This is safe because the heap page is held under exclusive lock for the entire operation. This reduces the number of WAL records generated by VACUUM phase III by up to 50%. Author: Melanie Plageman Reviewed-by: Andres Freund Reviewed-by: Robert Haas Reviewed-by: Kirill Reshke Discussion: https://postgr.es/m/flat/CAAKRu_ZMw6Npd_qm2KM%2BFwQ3cMOMx1Dh3VMhp8-V7SOLxdK9-g%40mail.gmail.com Branch -- master Details --- https://git.postgresql.org/pg/commitdiff/add323da40a6bf9e01cdda510e32ea924c89cd1a Modified Files -- src/backend/access/heap/heapam_xlog.c | 128 +++-- src/backend/access/heap/pruneheap.c| 55 ++- src/backend/access/heap/vacuumlazy.c | 166 - src/backend/access/rmgrdesc/heapdesc.c | 11 ++- src/include/access/heapam.h| 1 + src/include/access/heapam_xlog.h | 17 +++- 6 files changed, 302 insertions(+), 76 deletions(-)
pgsql: Fix incorrect message-printing in win32security.c.
Fix incorrect message-printing in win32security.c. log_error() would probably fail completely if used, and would certainly print garbage for anything that needed to be interpolated into the message, because it was failing to use the correct printing subroutine for a va_list argument. This bug likely went undetected because the error cases this code is used for are rarely exercised - they only occur when Windows security API calls fail catastrophically (out of memory, security subsystem corruption, etc). The FRONTEND variant can be fixed just by calling vfprintf() instead of fprintf(). However, there was no va_list variant of write_stderr(), so create one by refactoring that function. Following the usual naming convention for such things, call it vwrite_stderr(). Author: Bryan Green Reviewed-by: Tom Lane Discussion: https://postgr.es/m/CAF+pBj8goe4fRmZ0V3Cs6eyWzYLvK+HvFLYEYWG=tzam+tw...@mail.gmail.com Backpatch-through: 13 Branch -- REL_13_STABLE Details --- https://git.postgresql.org/pg/commitdiff/75a555d61b3b407c904f41c6a5cb922675f237e2 Modified Files -- src/backend/utils/error/elog.c | 16 +--- src/include/utils/elog.h | 1 + src/port/win32security.c | 4 ++-- 3 files changed, 16 insertions(+), 5 deletions(-)
pgsql: Fix incorrect message-printing in win32security.c.
Fix incorrect message-printing in win32security.c. log_error() would probably fail completely if used, and would certainly print garbage for anything that needed to be interpolated into the message, because it was failing to use the correct printing subroutine for a va_list argument. This bug likely went undetected because the error cases this code is used for are rarely exercised - they only occur when Windows security API calls fail catastrophically (out of memory, security subsystem corruption, etc). The FRONTEND variant can be fixed just by calling vfprintf() instead of fprintf(). However, there was no va_list variant of write_stderr(), so create one by refactoring that function. Following the usual naming convention for such things, call it vwrite_stderr(). Author: Bryan Green Reviewed-by: Tom Lane Discussion: https://postgr.es/m/CAF+pBj8goe4fRmZ0V3Cs6eyWzYLvK+HvFLYEYWG=tzam+tw...@mail.gmail.com Backpatch-through: 13 Branch -- master Details --- https://git.postgresql.org/pg/commitdiff/03bf7a12c5a44ced377352c8f9bf6e9f4b863885 Modified Files -- src/backend/utils/error/elog.c | 16 +--- src/include/utils/elog.h | 1 + src/port/win32security.c | 4 ++-- 3 files changed, 16 insertions(+), 5 deletions(-)
pgsql: Remove extra semicolon in example
Remove extra semicolon in example Reported-By: Pavel Luzanov Discussion: https://postgr.es/m/[email protected] Backpatch-through: 18 Branch -- REL_18_STABLE Details --- https://git.postgresql.org/pg/commitdiff/656736402f54f840a90acfae7db826c53d0a8b1d Modified Files -- doc/src/sgml/func.sgml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
pgsql: Remove extra semicolon in example
Remove extra semicolon in example Reported-By: Pavel Luzanov Discussion: https://postgr.es/m/[email protected] Backpatch-through: 18 Branch -- master Details --- https://git.postgresql.org/pg/commitdiff/e062af861b453769ccf3940fe7920a87d7d31fe6 Modified Files -- doc/src/sgml/func/func-json.sgml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
pgsql: Remove state.tmp when failing to save a replication slot
Remove state.tmp when failing to save a replication slot An error happening while a slot data is saved on disk in SaveSlotToPath() could cause a state.tmp file (temporary file holding the slot state data, renamed to its permanent name at the end of the function) to remain around after it has been created. This temporary file is created with O_EXCL, meaning that if an existing state.tmp is found, its creation would fail. This would prevent the slot data to be saved, requiring a manual intervention to remove state.tmp before being able to save again a slot. Possible scenarios where this temporary file could remain on disk is for example a ENOSPC case (no disk space) while writing, syncing or renaming it. The bug reports point to a write failure as the principal cause of the problems. Using O_TRUNC has been argued back in 2019 as a potential solution to discard any temporary file that could exist. This solution was rejected as O_EXCL can also act as a safety measure when saving the slot state, crash recovery offering cleanup guarantees post-crash. This commit uses the alternative approach that has been suggested by Andres Freund back in 2019. When the temporary state file cannot be written, synced, closed or renamed (note: not when created!), an unlink() is used to remove the temporary state file while holding the in-progress I/O LWLock, so as any follow-up attempts to save a slot's data would not choke on an existing file that remained around because of a previous failure. This problem has been reported a few times across the years, going back to 2019, but for some reason I have never come back to do something about it and it has been forgotten. A recent report has reminded me that this was still a problem. Reported-by: Kevin K Biju Reported-by: Sergei Kornilov Reported-by: Grigory Smolkin Discussion: https://postgr.es/m/cam45keha32sokl_g8vk38cwvtbeooxcsxapas7jt7yprf2m...@mail.gmail.com Discussion: https://postgr.es/m/[email protected] Discussion: https://postgr.es/m/[email protected] Backpatch-through: 13 Branch -- REL_15_STABLE Details --- https://git.postgresql.org/pg/commitdiff/0adf424b490b701ae8eeeca3d9630773f18cd937 Modified Files -- src/backend/replication/slot.c | 7 +++ 1 file changed, 7 insertions(+)
pgsql: Fix serious performance problems in LZ4Stream_read_internal.
Fix serious performance problems in LZ4Stream_read_internal. I was distressed to find that reading an LZ4-compressed toc.dat file was hundreds of times slower than it ought to be. On investigation, the blame mostly affixes to LZ4Stream_read_overflow's habit of memmove'ing all the remaining buffered data after each read operation. Since reading a TOC file tends to involve a lot of small (even one-byte) decompression calls, that amounts to an O(N^2) cost. This could have been fixed with a minimal patch, but to my eyes LZ4Stream_read_internal and LZ4Stream_read_overflow are badly-written spaghetti code; in particular the eol_flag logic is inefficient and duplicative. I chose to throw the code away and rewrite from scratch. This version is about sixty lines shorter as well as not having the performance issue. Fortunately, AFAICT the only way to get to this problem is to manually LZ4-compress the toc.dat and/or blobs.toc files within a directory-style archive; in the main data files, we read blocks that are large enough that the O(N^2) behavior doesn't manifest. Few people do that, which likely explains the lack of field complaints. Otherwise this performance bug might be considered bad enough to warrant back-patching. Author: Tom Lane Reviewed-by: Chao Li Discussion: https://postgr.es/m/[email protected] Branch -- master Details --- https://git.postgresql.org/pg/commitdiff/1f8062dd9668572d66549fc798a7d2057aa34ee1 Modified Files -- src/bin/pg_dump/compress_lz4.c | 242 +++-- 1 file changed, 89 insertions(+), 153 deletions(-)
pgsql: Fix version number calculation for data folder flush in pg_combi
Fix version number calculation for data folder flush in pg_combinebackup The version number calculated by read_pg_version_file() is multiplied once by 1, to be able to do comparisons based on PG_VERSION_NUM or equivalents with a minor version included. However, the version number given sync_pgdata() was multiplied by 1 a second time, leading to an overestimated number. This issue was harmless (still incorrect) as pg_combinebackup does not support versions of Postgres older than v10, and sync_pgdata() only includes a version check due to the rename of pg_xlog/ to pg_wal/. This folder rename happened in the development cycle of v10. This would become a problem if in the future sync_pgdata() is changed to have more version-specific checks. Oversight in dc212340058b, so backpatch down to v17. Reviewed-by: Chao Li Discussion: https://postgr.es/m/[email protected] Backpatch-through: 17 Branch -- master Details --- https://git.postgresql.org/pg/commitdiff/1c05fe11abb6a1588b158e3f39b668053e24cdae Modified Files -- src/bin/pg_combinebackup/pg_combinebackup.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
pgsql: Fix incorrect message-printing in win32security.c.
Fix incorrect message-printing in win32security.c. log_error() would probably fail completely if used, and would certainly print garbage for anything that needed to be interpolated into the message, because it was failing to use the correct printing subroutine for a va_list argument. This bug likely went undetected because the error cases this code is used for are rarely exercised - they only occur when Windows security API calls fail catastrophically (out of memory, security subsystem corruption, etc). The FRONTEND variant can be fixed just by calling vfprintf() instead of fprintf(). However, there was no va_list variant of write_stderr(), so create one by refactoring that function. Following the usual naming convention for such things, call it vwrite_stderr(). Author: Bryan Green Reviewed-by: Tom Lane Discussion: https://postgr.es/m/CAF+pBj8goe4fRmZ0V3Cs6eyWzYLvK+HvFLYEYWG=tzam+tw...@mail.gmail.com Backpatch-through: 13 Branch -- REL_16_STABLE Details --- https://git.postgresql.org/pg/commitdiff/9883e3cd1c61b1bf23d655e62789ff56c3da3bb5 Modified Files -- src/backend/utils/error/elog.c | 16 +--- src/include/utils/elog.h | 1 + src/port/win32security.c | 4 ++-- 3 files changed, 16 insertions(+), 5 deletions(-)
pgsql: Doc: clarify n_distinct_inherited setting
Doc: clarify n_distinct_inherited setting There was some confusion around how to adjust the n_distinct estimates for partitioned tables. Here we try and clarify that n_distinct_inherited needs to be adjusted rather than n_distinct. Also fix some slightly misleading text which was talking about table size rather than table rows, fix a grammatical error, and adjust some text which indicated that ANALYZE was performing calculations based on the n_distinct settings. Really it's the query planner that does this and ANALYZE only stores the overridden n_distinct estimate value in pg_statistic. Author: David Rowley Reviewed-by: David G. Johnston Reviewed-by: Chao Li Backpatch-through: 13 Discussion: https://postgr.es/m/CAApHDvrL7a-ZytM1SP8Uk9nEw9bR2CPzVb+uP+bcNj=_q-z...@mail.gmail.com Branch -- REL_16_STABLE Details --- https://git.postgresql.org/pg/commitdiff/0a47f054ef8d5b9f2d9702ba1a558953fc7c Modified Files -- doc/src/sgml/ref/alter_table.sgml | 34 -- 1 file changed, 16 insertions(+), 18 deletions(-)
pgsql: Fix issue with reading zero bytes in Gzip_read.
Fix issue with reading zero bytes in Gzip_read. pg_dump expects a read request of zero bytes to be a no-op; see for example ReadStr(). Gzip_read got this wrong and falsely supposed that the resulting gzret == 0 indicated an error. We could complicate that error-checking logic some more, but it seems best to just fall out immediately when passed size == 0. This bug breaks the nominally-supported case of manually gzip'ing the toc.dat file within a directory-style dump, so back-patch to v16 where this code came in. (Prior branches already have a short-circuit for size == 0 before their only gzread call.) Author: Tom Lane Reviewed-by: Chao Li Discussion: https://postgr.es/m/[email protected] Backpatch-through: 16 Branch -- REL_16_STABLE Details --- https://git.postgresql.org/pg/commitdiff/1518b7d76aadcbdffa6214555b82b995e7404b38 Modified Files -- src/bin/pg_dump/compress_gzip.c | 4 1 file changed, 4 insertions(+)
pgsql: Fix poor buffering logic in pg_dump's lz4 and zstd compression c
Fix poor buffering logic in pg_dump's lz4 and zstd compression code. Both of these modules dumped each bit of output that they got from the underlying compression library as a separate "data block" in the emitted archive file. In the case of zstd this'd frequently result in block sizes well under 100 bytes; lz4 is a little better but still produces blocks around 300 bytes, at least in the test case I tried. This bloats the archive file a little bit compared to larger block sizes, but the real problem is that when pg_restore has to skip each data block rather than seeking directly to some target data, tiny block sizes are enormously inefficient. Fix both modules so that they fill their allocated buffer reasonably well before dumping a data block. In the case of lz4, also delete some redundant logic that caused the lz4 frame header to be emitted as a separate data block. (That saves little, but I see no reason to expend extra code to get worse results.) I fixed the "stream API" code too. In those cases, feeding small amounts of data to fwrite() probably doesn't have any meaningful performance consequences. But it seems like a bad idea to leave the two sets of code doing the same thing in two different ways. In passing, remove unnecessary "extra paranoia" check in _ZstdWriteCommon. _CustomWriteFunc (the only possible referent of cs->writeF) already protects itself against zero-length writes, and it's really a modularity violation for _ZstdWriteCommon to know that the custom format disallows empty data blocks. Also, fix Zstd_read_internal to do less work when passed size == 0. Reported-by: Dimitrios Apostolou Author: Tom Lane Reviewed-by: Chao Li Discussion: https://postgr.es/m/[email protected] Branch -- master Details --- https://git.postgresql.org/pg/commitdiff/fe8192a95e6c7159d639e341740e32966c9cf385 Modified Files -- src/bin/pg_dump/compress_lz4.c | 167 +++- src/bin/pg_dump/compress_zstd.c | 42 +- 2 files changed, 118 insertions(+), 91 deletions(-)
pgsql: Doc: clarify n_distinct_inherited setting
Doc: clarify n_distinct_inherited setting There was some confusion around how to adjust the n_distinct estimates for partitioned tables. Here we try and clarify that n_distinct_inherited needs to be adjusted rather than n_distinct. Also fix some slightly misleading text which was talking about table size rather than table rows, fix a grammatical error, and adjust some text which indicated that ANALYZE was performing calculations based on the n_distinct settings. Really it's the query planner that does this and ANALYZE only stores the overridden n_distinct estimate value in pg_statistic. Author: David Rowley Reviewed-by: David G. Johnston Reviewed-by: Chao Li Backpatch-through: 13 Discussion: https://postgr.es/m/CAApHDvrL7a-ZytM1SP8Uk9nEw9bR2CPzVb+uP+bcNj=_q-z...@mail.gmail.com Branch -- master Details --- https://git.postgresql.org/pg/commitdiff/615a0fc2f1fd30df863b0dde2c045eaab8018ec6 Modified Files -- doc/src/sgml/ref/alter_table.sgml | 34 -- 1 file changed, 16 insertions(+), 18 deletions(-)
