pgsql: Add macro to disable address safety instrumentation

2024-04-05 Thread John Naylor
Add macro to disable address safety instrumentation

fasthash_accum_cstring_aligned() uses a technique, found in various
strlen() implementations, to detect a string's NUL terminator by
reading a word at a time. That triggers failures when testing with
"-fsanitize=address", at least with frontend code. To enable using
this function anywhere, add a function attribute macro to disable
such testing.
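
For illustration only, here is a minimal sketch of the technique described
above; it is not the committed code. The macro name
EXAMPLE_NO_SANITIZE_ADDRESS and the helper functions are hypothetical, and
the attribute spelling assumes GCC 8 or later, or Clang; on other compilers
the macro expands to nothing.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical macro name; expands to nothing on unsupported compilers. */
#if defined(__clang__) || (defined(__GNUC__) && __GNUC__ >= 8)
#define EXAMPLE_NO_SANITIZE_ADDRESS __attribute__((no_sanitize("address")))
#else
#define EXAMPLE_NO_SANITIZE_ADDRESS
#endif

/* Nonzero if and only if at least one byte of v is zero. */
static inline uint64_t
example_has_zero_byte(uint64_t v)
{
    return (v - UINT64_C(0x0101010101010101)) & ~v &
        UINT64_C(0x8080808080808080);
}

/*
 * strlen()-like function that reads one aligned 8-byte word at a time.
 * The final word read may extend past the NUL terminator, which is what
 * an address sanitizer complains about, hence the attribute.
 */
EXAMPLE_NO_SANITIZE_ADDRESS
static size_t
example_strlen_wordwise(const char *s)
{
    const char *p = s;

    /* Advance byte by byte until p is 8-byte aligned. */
    while (((uintptr_t) p % sizeof(uint64_t)) != 0)
    {
        if (*p == '\0')
            return (size_t) (p - s);
        p++;
    }

    /* Skip whole words that contain no zero byte. */
    for (;;)
    {
        uint64_t    word;

        memcpy(&word, p, sizeof(word));
        if (example_has_zero_byte(word))
            break;
        p += sizeof(word);
    }

    /* The terminator is somewhere in this word; find the exact byte. */
    while (*p != '\0')
        p++;
    return (size_t) (p - s);
}
```

Because the aligned word never straddles a page boundary, the over-read is
harmless in practice, but a sanitizer has no way to know that.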

Reviewed by Jeff Davis

Discussion: 
https://postgr.es/m/CANWCAZbwvp7oUEkbw-xP4L0_S_WNKq-J-ucP4RCNDPJnrakUPw%40mail.gmail.com

Branch
--
master

Details
---
https://git.postgresql.org/pg/commitdiff/db17594ad73a871a176a9bf96e0589c2cf57052c

Modified Files
--
src/include/c.h  | 13 +
src/include/common/hashfn_unstable.h |  5 -
2 files changed, 17 insertions(+), 1 deletion(-)



pgsql: Fix incorrect return type

2024-04-05 Thread John Naylor
Fix incorrect return type

fasthash32() calculates a 32-bit hashcode, but the return
type was uint64. Change to uint32.
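
A minimal sketch of the kind of mismatch being fixed. The function name is
hypothetical, and the folding step is an assumption chosen for
illustration; it is not necessarily what hashfn_unstable.h does.

```c
#include <stdint.h>

/*
 * Reduce a 64-bit hash to 32 bits.  Since the result only ever carries
 * 32 significant bits, uint32_t is the accurate return type; declaring
 * it as a 64-bit type would misstate the range to callers.
 */
static inline uint32_t
example_reduce32(uint64_t h)
{
    /* Fold the high half into the low half, then truncate. */
    return (uint32_t) (h - (h >> 32));
}
```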

Noted by Jeff Davis

Discussion: 
https://postgr.es/m/b16c93e6c736a422d4de668343515375664eb05d.camel%40j-davis.com

Branch
--
master

Details
---
https://git.postgresql.org/pg/commitdiff/4b968e2027ba46b31be0a648486f86a2cadc707d

Modified Files
--
src/include/common/hashfn_unstable.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)



pgsql: Convert uses of hash_string_pointer to fasthash equivalent

2024-04-05 Thread John Naylor
Convert uses of hash_string_pointer to fasthash equivalent

Remove duplicate hash_string_pointer() function definitions by creating
a new inline function hash_string() for this purpose.

This has the added advantage of avoiding strlen() calls when doing hash
lookup. It's not clear how many of these are performance-sensitive
enough to benefit from that, but the simplification is worth it on
its own.
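
To show why hashing a NUL-terminated string directly avoids the strlen()
call, here is a small sketch. It uses FNV-1a as a stand-in hash, not
fasthash, and both function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define EXAMPLE_FNV_OFFSET UINT32_C(2166136261)
#define EXAMPLE_FNV_PRIME  UINT32_C(16777619)

/* Length-based variant: callers holding a C string must call strlen(). */
static uint32_t
example_hash_bytes(const char *s, size_t len)
{
    uint32_t    h = EXAMPLE_FNV_OFFSET;

    for (size_t i = 0; i < len; i++)
    {
        h ^= (unsigned char) s[i];
        h *= EXAMPLE_FNV_PRIME;
    }
    return h;
}

/* String variant: a single pass that stops at the NUL terminator. */
static inline uint32_t
example_hash_string(const char *s)
{
    uint32_t    h = EXAMPLE_FNV_OFFSET;

    for (; *s != '\0'; s++)
    {
        h ^= (unsigned char) *s;
        h *= EXAMPLE_FNV_PRIME;
    }
    return h;
}
```

Both variants return the same value for the same string; the second simply
walks the bytes once instead of twice.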

Reviewed by Jeff Davis

Discussion: 
https://postgr.es/m/CANWCAZbg_XeSeY0a_PqWmWqeRATvzTzUNYRLeT%2Bbzs%2BYQdC92g%40mail.gmail.com

Branch
--
master

Details
---
https://git.postgresql.org/pg/commitdiff/f956ecd0353b2960f8322b2211142113fe2b6f67

Modified Files
--
src/bin/pg_combinebackup/load_manifest.c  | 16 ++--
src/bin/pg_dump/pg_dumpall.c  | 17 ++---
src/bin/pg_rewind/filemap.c   | 17 ++---
src/bin/pg_verifybackup/pg_verifybackup.c | 16 ++--
src/include/common/hashfn_unstable.h  | 20 
5 files changed, 28 insertions(+), 58 deletions(-)



pgsql: Improve read_stream.c's fast path.

2024-04-05 Thread Thomas Munro
Improve read_stream.c's fast path.

The "fast path" for well cached scans that don't do any I/O was
accidentally coded in a way that could only be triggered by pg_prewarm's
usage pattern, which starts out with a higher distance because of the
flags it passes in.  We want it to work for streaming sequential scans
too, once that patch is committed.  Adjust.

Reviewed-by: Melanie Plageman 
Discussion: 
https://postgr.es/m/CA%2BhUKGKXZALJ%3D6aArUsXRJzBm%3Dqvc4AWp7%3DiJNXJQqpbRLnD_w%40mail.gmail.com

Branch
--
master

Details
---
https://git.postgresql.org/pg/commitdiff/aa1e8c206454d666ab600050022aef2c3c390f69

Modified Files
--
src/backend/storage/aio/read_stream.c | 75 +++
1 file changed, 31 insertions(+), 44 deletions(-)



pgsql: Fix headerscheck violation introduced in f8ce4ed78ca

2024-04-05 Thread Andres Freund
Fix headerscheck violation introduced in f8ce4ed78ca

Per ci.

Branch
--
master

Details
---
https://git.postgresql.org/pg/commitdiff/9e7386924e837aef8d48895cf72a6a0b7f78cbe9

Modified Files
--
src/bin/pg_combinebackup/reconstruct.h | 1 +
1 file changed, 1 insertion(+)



pgsql: Silence some compiler warnings in commit 3311ea86ed

2024-04-05 Thread Andrew Dunstan
Silence some compiler warnings in commit 3311ea86ed

Per report from Nathan Bossart

Branch
--
master

Details
---
https://git.postgresql.org/pg/commitdiff/c3e60f3d7e4268c6448ec592053b3b893899867d

Modified Files
--
src/common/jsonapi.c | 7 +++
1 file changed, 7 insertions(+)



pgsql: Fix incorrect calculation in BlockRefTableEntryGetBlocks.

2024-04-05 Thread Robert Haas
Fix incorrect calculation in BlockRefTableEntryGetBlocks.

The previous formula was incorrect in the case where the function's
nblocks argument was a multiple of BLOCKS_PER_CHUNK, which happens
whenever a relation segment file is exactly 512MB or exactly 1GB in
length. In such cases, the formula would calculate a stop_offset of
0 rather than 65536, resulting in modified blocks in the second half
of a 1GB file, or all the modified blocks in a 512MB file, being
omitted from the incremental backup.
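
The arithmetic can be sketched as follows. This is an illustration of the
failure mode, not the committed fix; the function names are hypothetical,
and the chunk size of 65536 blocks is the value given in the message above.

```c
#include <stdint.h>

#define EXAMPLE_BLOCKS_PER_CHUNK UINT32_C(65536)

/* Naive form: yields 0 when nblocks is an exact multiple of the chunk. */
static uint32_t
example_stop_offset_naive(uint32_t nblocks)
{
    return nblocks % EXAMPLE_BLOCKS_PER_CHUNK;
}

/* Corrected form: a zero remainder means the last chunk is full. */
static uint32_t
example_stop_offset_fixed(uint32_t nblocks)
{
    uint32_t    rem = nblocks % EXAMPLE_BLOCKS_PER_CHUNK;

    if (nblocks != 0 && rem == 0)
        return EXAMPLE_BLOCKS_PER_CHUNK;
    return rem;
}
```

With 8kB blocks, a 512MB segment has 65536 blocks and a 1GB segment has
131072, so both hit the zero-remainder case in the naive form.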

Reported off-list by Tomas Vondra and Jakub Wartak.

Discussion: 
http://postgr.es/m/CA+TgmoYwy_KHp1-5GYNmVa=zdejwhnh1t0sbmeuvqqnjehj...@mail.gmail.com

Branch
--
master

Details
---
https://git.postgresql.org/pg/commitdiff/55a5ee30cd65886ff0a2e7ffef4ec2816fbec273

Modified Files
--
src/common/blkreftable.c | 6 +-
1 file changed, 5 insertions(+), 1 deletion(-)



pgsql: Check HAVE_COPY_FILE_RANGE before calling copy_file_range

2024-04-05 Thread Tomas Vondra
Check HAVE_COPY_FILE_RANGE before calling copy_file_range

Fix a mistake in ac8110155132 - write_reconstructed_file() called
copy_file_range() without properly checking HAVE_COPY_FILE_RANGE.

Reported by several macOS machines. Also reported by cfbot, but I missed
that issue before commit.
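
A rough sketch of this kind of guard. The wrapper name and its ENOSYS
fallback are assumptions made for illustration; the committed code may
report the unsupported case differently. HAVE_COPY_FILE_RANGE is assumed to
be defined by the build system only where the call exists.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* glibc declares copy_file_range() with this */
#endif
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical wrapper: only reference copy_file_range() when the build
 * detected it, so platforms without it (such as macOS) still compile.
 */
static ssize_t
example_copy_range(int src_fd, off_t src_off,
                   int dst_fd, off_t dst_off, size_t len)
{
#if defined(HAVE_COPY_FILE_RANGE)
    return copy_file_range(src_fd, &src_off, dst_fd, &dst_off, len, 0);
#else
    /* Not available here: fail cleanly instead of failing to build. */
    (void) src_fd;
    (void) src_off;
    (void) dst_fd;
    (void) dst_off;
    (void) len;
    errno = ENOSYS;
    return -1;
#endif
}
```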

Branch
--
master

Details
---
https://git.postgresql.org/pg/commitdiff/079d94ab3412fdeda637a24b17bde53c576b0007

Modified Files
--
src/bin/pg_combinebackup/reconstruct.c | 4 
1 file changed, 4 insertions(+)



pgsql: Allow using copy_file_range in write_reconstructed_file

2024-04-05 Thread Tomas Vondra
Allow using copy_file_range in write_reconstructed_file

This commit allows using copy_file_range() for efficient combining of
data from multiple files, instead of simply reading/writing the blocks.
Depending on the filesystem and other factors (size of the increment,
distribution of modified blocks etc.) this may be faster than the
block-by-block copy, but more importantly it enables various features
provided by CoW filesystems.

If a checksum needs to be calculated for the file, the same strategy as
when copying whole files is used - copy_file_range is used to copy the
blocks, but the file is also read for the checksum calculation.

While the checksum calculation is rarely needed when cloning whole
files, when reconstructing the files from multiple backups it needs to
happen almost always (the only exception is when the user specified
--no-manifest).

Author: Tomas Vondra
Reviewed-by: Thomas Munro, Jakub Wartak, Robert Haas
Discussion: 
https://postgr.es/m/3024283a-7491-4240-80d0-421575f6bb23%40enterprisedb.com

Branch
--
master

Details
---
https://git.postgresql.org/pg/commitdiff/ac81101551326ddf4c5cb804c75bd3e8c56506ba

Modified Files
--
src/bin/pg_combinebackup/reconstruct.c | 134 ++---
1 file changed, 106 insertions(+), 28 deletions(-)



pgsql: Make libpqsrv_cancel's return const char *, not char *

2024-04-05 Thread Alvaro Herrera
Make libpqsrv_cancel's return const char *, not char *

Per headerscheck's C++ check.

Discussion: https://postgr.es/m/372769.1712179...@sss.pgh.pa.us

Branch
--
master

Details
---
https://git.postgresql.org/pg/commitdiff/b8b37e41ba4aae1e79dcfaeb9eb0fd7549773ff5

Modified Files
--
contrib/dblink/dblink.c | 2 +-
contrib/postgres_fdw/connection.c   | 2 +-
src/include/libpq/libpq-be-fe-helpers.h | 4 ++--
3 files changed, 4 insertions(+), 4 deletions(-)



pgsql: Remove unused variable in checksum_file()

2024-04-05 Thread Tomas Vondra
Remove unused variable in checksum_file()

The 'offset' variable was set but otherwise unused.

Per buildfarm animals with clang, e.g. sifaka and longlin.

Branch
--
master

Details
---
https://git.postgresql.org/pg/commitdiff/8e392595e5619734db707553e8f72dfacf9ab86c

Modified Files
--
src/bin/pg_combinebackup/copy_file.c | 3 ---
1 file changed, 3 deletions(-)



pgsql: Allow copying files using clone/copy_file_range

2024-04-05 Thread Tomas Vondra
Allow copying files using clone/copy_file_range

Adds --clone/--copy-file-range options to pg_combinebackup, to allow
copying files using file cloning or copy_file_range(). These methods may
be faster than the standard block-by-block copy, but the main advantage
is that they enable various features provided by CoW filesystems.

This commit only uses these copy methods for files that did not change
and can be copied as a whole from a single backup.

These new copy methods may not be available on all platforms, in which
case the command throws an error (immediately, even if no files would be
copied as a whole). This early failure seems better than failing later
when trying to copy the first file, after performing a lot of work on
earlier files.

If the requested copy method is available, but a checksum needs to be
recalculated (e.g. because of a different checksum type), the file is
still copied using the requested method, but it is also read for the
checksum calculation. Depending on the filesystem this may be more
expensive than just performing the simple copy, but it does enable the
CoW benefits.

Initial patch by Jakub Wartak, various reworks and improvements by me.

Author: Tomas Vondra, Jakub Wartak
Reviewed-by: Thomas Munro, Jakub Wartak, Robert Haas
Discussion: 
https://postgr.es/m/3024283a-7491-4240-80d0-421575f6bb23%40enterprisedb.com

Branch
--
master

Details
---
https://git.postgresql.org/pg/commitdiff/f8ce4ed78ca6e35bf135545e34bd49cd65d88ea2

Modified Files
--
doc/src/sgml/ref/pg_combinebackup.sgml  |  45 ++
src/bin/pg_combinebackup/copy_file.c| 206 +++-
src/bin/pg_combinebackup/copy_file.h|  18 ++-
src/bin/pg_combinebackup/pg_combinebackup.c |  45 +-
src/bin/pg_combinebackup/reconstruct.c  |   3 +-
src/bin/pg_combinebackup/reconstruct.h  |   1 +
src/tools/pgindent/typedefs.list|   1 +
7 files changed, 278 insertions(+), 41 deletions(-)



pgsql: Suppress "variable may be used uninitialized" warning.

2024-04-05 Thread Tom Lane
Suppress "variable may be used uninitialized" warning.

Buildfarm member caiman is showing this, which surprises me because
it's very late-model gcc (14.0.1) and ought to be smart enough to
know that elog(ERROR) doesn't return.  But we're likely to see the
same from stupider compilers too, so add a dummy initialization in
our usual style.
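
A small sketch of the pattern, with hypothetical names. The error function
is deliberately not marked as non-returning, to mimic a compiler that
cannot prove the call never comes back.

```c
#include <stdio.h>
#include <stdlib.h>

/* Stand-in for an error report that never returns to its caller. */
static void
example_fatal(const char *msg)
{
    fprintf(stderr, "ERROR: %s\n", msg);
    abort();
}

static int
example_weight(int kind)
{
    int         weight = 0;     /* dummy initialization: keep compiler quiet */

    switch (kind)
    {
        case 1:
            weight = 10;
            break;
        case 2:
            weight = 20;
            break;
        default:
            example_fatal("unrecognized kind");
            break;
    }
    return weight;
}
```

Without the initializer, a compiler that does not know example_fatal()
never returns may warn that weight can be read uninitialized.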

Branch
--
master

Details
---
https://git.postgresql.org/pg/commitdiff/3c5ff36aba8a3df19967d0ddc1249c63417bb9b0

Modified Files
--
src/backend/parser/parse_expr.c | 1 +
1 file changed, 1 insertion(+)



pgsql: docs: Merge separate chapters on built-in index AMs into one.

2024-04-05 Thread Robert Haas
docs: Merge separate chapters on built-in index AMs into one.

The documentation index is getting very long, which makes it hard
to find things. Since these chapters are all very similar in structure
and content, merging them is a natural way of reducing the size of
the toplevel index.

Rather than actually combining all of the SGML into a single file,
keep one file per , and add a glue file that includes all
of them.

Discussion: 
http://postgr.es/m/CA+Tgmob7_uoYuS2=rvwpvxarwp-uxz+++saytc-bcz42qzs...@mail.gmail.com

Branch
--
master

Details
---
https://git.postgresql.org/pg/commitdiff/fe8eaa54420cbb384bd5ad179495bb9774b6b48f

Modified Files
--
doc/src/sgml/brin.sgml   | 22 +++---
doc/src/sgml/btree.sgml  | 32 
doc/src/sgml/filelist.sgml   |  1 +
doc/src/sgml/gin.sgml| 40 
doc/src/sgml/gist.sgml   | 28 ++--
doc/src/sgml/hash.sgml   | 12 ++--
doc/src/sgml/indextypes.sgml | 13 +
doc/src/sgml/postgres.sgml   |  7 +--
doc/src/sgml/spgist.sgml | 36 ++--
9 files changed, 100 insertions(+), 91 deletions(-)



pgsql: Align blocks in incremental backups to BLCKSZ

2024-04-05 Thread Tomas Vondra
Align blocks in incremental backups to BLCKSZ

Align blocks stored in incremental files to BLCKSZ, so that the
incremental backups work well with CoW filesystems.

The header of the incremental file is padded with \0 to a multiple of
BLCKSZ, so that the block data (also BLCKSZ) is aligned to BLCKSZ. The
padding is added only to files containing block data, so files with just
the header remain small. This adds a bit of extra space, but as the
number of blocks increases the overhead gets negligible very quickly.
And as the padding is \0 bytes, it does compress extremely well.
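
The padding rule described above can be sketched like this. The function
name is hypothetical and the block size is assumed to be the 8K default;
this is not the code from basebackup_incremental.c.

```c
#include <stddef.h>

#define EXAMPLE_BLCKSZ 8192     /* assumed default block size */

/*
 * Return the on-disk header size: unchanged when the file carries no
 * block data, otherwise rounded up to the next multiple of the block size
 * so that the block data following it starts on an aligned offset.
 */
static size_t
example_padded_header_size(size_t header_len, unsigned int num_blocks)
{
    if (num_blocks == 0)
        return header_len;
    return ((header_len + EXAMPLE_BLCKSZ - 1) / EXAMPLE_BLCKSZ) *
        EXAMPLE_BLCKSZ;
}
```

For example, a 100-byte header followed by three blocks is padded to 8192
bytes, so the overhead is under one block per incremental file.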

The alignment is important for CoW filesystems that usually require the
blocks to be aligned to filesystem page size for features like block
sharing, deduplication etc. to work well. With the variable sized header
the blocks in the increments were not aligned at all, negating the
benefits of the CoW filesystems.

This matters even for non-CoW filesystems, for example when placed on a
RAID array. If the block is not aligned, it may easily span multiple
devices, causing read and write amplification.

It might be better to align the blocks to the filesystem page, not
BLCKSZ, but we have no good way to determine that. Even if we determine
the page size at the time of taking the backup, the backup may move. For
now the BLCKSZ seems sufficient - the filesystem page is usually 4K, so
the default BLCKSZ (8K) is aligned to that.

Author: Tomas Vondra
Reviewed-by: Robert Haas, Jakub Wartak
Discussion: 
https://postgr.es/m/3024283a-7491-4240-80d0-421575f6bb23%40enterprisedb.com

Branch
--
master

Details
---
https://git.postgresql.org/pg/commitdiff/10e3226ba13d184fc3165138c619eb7f2d52cdd2

Modified Files
--
src/backend/backup/basebackup.c | 26 +++
src/backend/backup/basebackup_incremental.c | 39 ++---
src/bin/pg_combinebackup/reconstruct.c  |  8 ++
src/include/backup/basebackup_incremental.h |  1 +
4 files changed, 70 insertions(+), 4 deletions(-)



pgsql: Operate XLogCtl->log{Write,Flush}Result with atomics

2024-04-05 Thread Alvaro Herrera
Operate XLogCtl->log{Write,Flush}Result with atomics

This removes the need to hold both the info_lck spinlock and
WALWriteLock to update them.  We use stock atomic write instead, with
WALWriteLock held.  Readers can use atomic read, without any locking.
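
As a rough illustration of the access pattern, here is a sketch using C11
<stdatomic.h> as a stand-in for PostgreSQL's own atomics layer. The
variable and function names are hypothetical, and the default sequentially
consistent ordering used here may be stronger than what the real code needs.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Stand-in for a shared "written up to" position. */
static _Atomic uint64_t example_write_result;

/*
 * Writer side.  The caller is assumed to already hold the lock that
 * serializes writers; no additional spinlock is taken for the store.
 */
static void
example_set_write_result(uint64_t pos)
{
    atomic_store(&example_write_result, pos);
}

/* Reader side: a plain atomic load, with no lock held. */
static uint64_t
example_get_write_result(void)
{
    return atomic_load(&example_write_result);
}
```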

This allows for some code to be reordered: some places were a bit
contorted to avoid repeated spinlock acquisition, but that's no longer a
concern, so we can turn them to more natural coding.  Some further
changes are possible (maybe leading to performance wins), but in this commit I
did rather minimal ones only, to avoid increasing the blast radius.

Reviewed-by: Bharath Rupireddy 
Reviewed-by: Jeff Davis 
Reviewed-by: Andres Freund  (earlier versions)
Discussion: https://postgr.es/m/20200831182156.GA3983@alvherre.pgsql

Branch
--
master

Details
---
https://git.postgresql.org/pg/commitdiff/ee1cbe806dad47674ded35427c6ba217531847d6

Modified Files
--
src/backend/access/transam/xlog.c | 107 +-
1 file changed, 59 insertions(+), 48 deletions(-)