On Wed, Mar 1, 2023 at 9:45 AM Nathan Bossart <nathandboss...@gmail.com> wrote:
>
> On Tue, Feb 28, 2023 at 10:38:31AM +0530, Bharath Rupireddy wrote:
> > On Tue, Feb 28, 2023 at 6:14 AM Nathan Bossart <nathandboss...@gmail.com> wrote:
> >> Why do we only read a page at a time in XLogReadFromBuffersGuts()?  What is
> >> preventing us from copying all the data we need in one go?
> >
> > Note that most of the WALRead() callers request a single page of
> > XLOG_BLCKSZ bytes even if the server has less or more available WAL
> > pages. It's the streaming replication wal sender that can request less
> > than XLOG_BLCKSZ bytes and up to MAX_SEND_SIZE (16 * XLOG_BLCKSZ). And,
> > if we read, say, MAX_SEND_SIZE at once while holding
> > WALBufMappingLock, that might impact concurrent inserters (at least, I
> > can say it in theory) - one of the main intentions of this patch is
> > not to impact inserters much.
>
> Perhaps we should test both approaches to see if there is a noticeable
> difference.  It might not be great for concurrent inserts to repeatedly
> take the lock, either.  If there's no real difference, we might be able to
> simplify the code a bit.

I took a stab at this - acquire WALBufMappingLock separately for each
requested WAL buffer page vs acquire WALBufMappingLock once for all
requested WAL buffer pages. I chose the pgbench tpcb-like benchmark,
which has 3 UPDATE statements and 1 INSERT statement. I ran pgbench for
30min with scale factor 100 and 4096 clients against a primary with 1
async standby, see [1]. I captured wait_events to see the contention on
WALBufMappingLock. I didn't notice any contention on the lock, and TPS
is effectively unchanged (all three runs are within about 0.3% of each
other): see [2] for results on HEAD, [3] for results on the v6 patch,
which uses the "acquire WALBufMappingLock separately for each requested
WAL buffer page" strategy, and [4] for results on the v7 patch (attached
herewith), which uses the "acquire WALBufMappingLock once for all
requested WAL buffer pages" strategy. Another thing to note from the
test results is the reduction in WALRead IO wait events from 136 on
HEAD to 1 on both the v6 and v7 patches. So, reading from WAL buffers
is really helping here.
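
To put a rough number on how the two strategies differ, here is a
minimal standalone sketch (not part of the patch). It counts how many
WAL buffer pages a read touches, which is also the number of
WALBufMappingLock acquisitions under the per-page strategy; the other
strategy takes the lock once per call. The names sketch_pages_touched,
sketch_acquisitions_per_page, sketch_acquisitions_once and
SKETCH_XLOG_BLCKSZ are made up for illustration, and 8192 is assumed as
the WAL block size (the default; it is a build-time option).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: default WAL block size; the real value is set at build time. */
#define SKETCH_XLOG_BLCKSZ 8192

/*
 * Number of WAL pages touched by a read of 'count' bytes starting at byte
 * position 'startptr'.
 */
uint64_t
sketch_pages_touched(uint64_t startptr, size_t count)
{
	uint64_t	first_page;
	uint64_t	last_page;

	assert(count > 0);

	first_page = startptr / SKETCH_XLOG_BLCKSZ;
	last_page = (startptr + count - 1) / SKETCH_XLOG_BLCKSZ;

	return last_page - first_page + 1;
}

/* Per-page strategy: one lock acquisition for every page touched. */
uint64_t
sketch_acquisitions_per_page(uint64_t startptr, size_t count)
{
	return sketch_pages_touched(startptr, count);
}

/* Single-acquisition strategy: one lock acquisition per call. */
uint64_t
sketch_acquisitions_once(uint64_t startptr, size_t count)
{
	(void) startptr;
	(void) count;
	return 1;
}
```

For a page-aligned request of MAX_SEND_SIZE (16 * XLOG_BLCKSZ) this
gives 16 acquisitions with the per-page strategy vs 1 with the other; a
request of XLOG_BLCKSZ bytes that starts mid-page touches 2 pages.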

With these observations, I'd like to go with the approach that acquires
WALBufMappingLock once for all requested WAL buffer pages, unlike v6
and the previous patches.

I'm attaching the v7 patch set with this change for further review.
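
For anyone skimming before opening the patch, the page-by-page copy in
XLogReadFromBuffers() can be illustrated with the following standalone
sketch. It is a simplified model, not the patch code: it leaves out the
locking and the page header checks, uses a tiny page size so the
arithmetic is easy to follow, and all the names (SketchWalBuffers,
sketch_load_page, sketch_read_from_buffers, SK_BLCKSZ, SK_NBUFFERS) are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SK_BLCKSZ	8			/* tiny page size, for illustration only */
#define SK_NBUFFERS	4			/* number of page slots in the simulated ring */

typedef struct SketchWalBuffers
{
	char		pages[SK_NBUFFERS][SK_BLCKSZ];
	uint64_t	end_ptr[SK_NBUFFERS];	/* end position of the page held in
										 * each slot, 0 if the slot is empty */
} SketchWalBuffers;

/* Put page number 'pageno' into its slot, filled with the byte 'fill'. */
void
sketch_load_page(SketchWalBuffers *wb, uint64_t pageno, char fill)
{
	size_t		idx = (size_t) (pageno % SK_NBUFFERS);

	memset(wb->pages[idx], fill, SK_BLCKSZ);
	wb->end_ptr[idx] = (pageno + 1) * SK_BLCKSZ;
}

/*
 * Copy up to 'count' bytes starting at 'startptr' into 'dst'. Stops at the
 * first page that is not in the buffers and returns the bytes copied so far,
 * so the result may be less than 'count' (partial hit) or 0 (miss).
 */
size_t
sketch_read_from_buffers(const SketchWalBuffers *wb, uint64_t startptr,
						 size_t count, char *dst)
{
	uint64_t	ptr = startptr;
	size_t		total = 0;

	while (total < count)
	{
		size_t		idx = (size_t) ((ptr / SK_BLCKSZ) % SK_NBUFFERS);
		size_t		off = (size_t) (ptr % SK_BLCKSZ);
		uint64_t	expected_end = ptr + (SK_BLCKSZ - off);
		size_t		chunk = SK_BLCKSZ - off;

		/* The slot holds some other page (or nothing): stop here. */
		if (wb->end_ptr[idx] != expected_end)
			break;

		/* Copy only what is wanted, not necessarily the whole page. */
		if (chunk > count - total)
			chunk = count - total;

		memcpy(dst + total, wb->pages[idx] + off, chunk);
		total += chunk;
		ptr += chunk;
	}

	return total;
}
```

A caller in the style of WALRead() would compare the return value with
the requested count: equal means a full hit, a smaller non-zero value
means the remaining bytes still have to come from the WAL file, and 0
means everything has to be read from the WAL file.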

[1]
shared_buffers = '8GB'
wal_buffers = '1GB'
max_wal_size = '16GB'
max_connections = '5000'
archive_mode = 'on'
archive_command='cp %p /home/ubuntu/archived_wal/%f'
./pgbench --initialize --scale=100 postgres
./pgbench -n -M prepared -U ubuntu postgres -b tpcb-like -c4096 -j4096 -T1800

[2]
HEAD:
done in 20.03 s (drop tables 0.00 s, create tables 0.01 s, client-side
generate 15.53 s, vacuum 0.19 s, primary keys 4.30 s).
tps = 11654.475345 (without initial connection time)

50950253  Lock            | transactionid
16472447  Lock            | tuple
3869523  LWLock          | LockManager
 739283  IPC             | ProcArrayGroupUpdate
 718549                  |
 439877  LWLock          | WALWrite
 130737  Client          | ClientRead
 121113  LWLock          | BufferContent
  70778  LWLock          | WALInsert
  43346  IPC             | XactGroupUpdate
  18547
  18546  Activity        | LogicalLauncherMain
  18545  Activity        | AutoVacuumMain
  18272  Activity        | ArchiverMain
  17627  Activity        | WalSenderMain
  17207  Activity        | WalWriterMain
  15455  IO              | WALSync
  14963  LWLock          | ProcArray
  14747  LWLock          | XactSLRU
  13943  Timeout         | CheckpointWriteDelay
  10519  Activity        | BgWriterHibernate
   8022  Activity        | BgWriterMain
   4486  Timeout         | SpinDelay
   4443  Activity        | CheckpointerMain
   1435  Lock            | extend
    670  LWLock          | XidGen
    373  IO              | WALWrite
    283  Timeout         | VacuumDelay
    268  IPC             | ArchiveCommand
    249  Timeout         | VacuumTruncate
    136  IO              | WALRead
    115  IO              | WALInitSync
     74  IO              | DataFileWrite
     67  IO              | WALInitWrite
     36  IO              | DataFileFlush
     35  IO              | DataFileExtend
     17  IO              | DataFileRead
      4  IO              | SLRUWrite
      3  IO              | BufFileWrite
      2  IO              | DataFileImmediateSync
      1 Tuples only is on.
      1  LWLock          | SInvalWrite
      1  LWLock          | LockFastPath
      1  IO              | ControlFileSyncUpdate

[3]
v6 patch:
done in 19.99 s (drop tables 0.00 s, create tables 0.01 s, client-side
generate 15.52 s, vacuum 0.18 s, primary keys 4.28 s).
tps = 11689.584538 (without initial connection time)

50678977  Lock            | transactionid
16252048  Lock            | tuple
4146827  LWLock          | LockManager
 768256                  |
 719923  IPC             | ProcArrayGroupUpdate
 432836  LWLock          | WALWrite
 140354  Client          | ClientRead
 124203  LWLock          | BufferContent
  74355  LWLock          | WALInsert
  39852  IPC             | XactGroupUpdate
  30728
  30727  Activity        | LogicalLauncherMain
  30726  Activity        | AutoVacuumMain
  30420  Activity        | ArchiverMain
  29881  Activity        | WalSenderMain
  29418  Activity        | WalWriterMain
  23428  Activity        | BgWriterHibernate
  15960  Timeout         | CheckpointWriteDelay
  15840  IO              | WALSync
  15066  LWLock          | ProcArray
  14577  Activity        | CheckpointerMain
  14377  LWLock          | XactSLRU
   7291  Activity        | BgWriterMain
   4336  Timeout         | SpinDelay
   1707  Lock            | extend
    720  LWLock          | XidGen
    362  Timeout         | VacuumTruncate
    360  IO              | WALWrite
    304  Timeout         | VacuumDelay
    301  IPC             | ArchiveCommand
    106  IO              | WALInitSync
     82  IO              | DataFileWrite
     66  IO              | WALInitWrite
     45  IO              | DataFileFlush
     25  IO              | DataFileExtend
     18  IO              | DataFileRead
      5  LWLock          | LockFastPath
      2  IO              | DataFileSync
      2  IO              | DataFileImmediateSync
      1 Tuples only is on.
      1  LWLock          | BufferMapping
      1  IO              | WALRead
      1  IO              | SLRUWrite
      1  IO              | SLRURead
      1  IO              | ReplicationSlotSync
      1  IO              | BufFileRead

[4]
v7 patch:
done in 19.92 s (drop tables 0.00 s, create tables 0.01 s, client-side
generate 15.53 s, vacuum 0.23 s, primary keys 4.16 s).
tps = 11671.869074 (without initial connection time)

50614021  Lock            | transactionid
16482561  Lock            | tuple
4086451  LWLock          | LockManager
 777507                  |
 714329  IPC             | ProcArrayGroupUpdate
 420593  LWLock          | WALWrite
 138142  Client          | ClientRead
 125381  LWLock          | BufferContent
  75283  LWLock          | WALInsert
  38759  IPC             | XactGroupUpdate
  20283
  20282  Activity        | LogicalLauncherMain
  20281  Activity        | AutoVacuumMain
  20002  Activity        | ArchiverMain
  19467  Activity        | WalSenderMain
  19036  Activity        | WalWriterMain
  15836  IO              | WALSync
  15708  Timeout         | CheckpointWriteDelay
  15346  LWLock          | ProcArray
  15095  LWLock          | XactSLRU
  11852  Activity        | BgWriterHibernate
   8424  Activity        | BgWriterMain
   4636  Timeout         | SpinDelay
   4415  Activity        | CheckpointerMain
   2048  Lock            | extend
   1457  Timeout         | VacuumTruncate
    646  LWLock          | XidGen
    402  IO              | WALWrite
    306  Timeout         | VacuumDelay
    278  IPC             | ArchiveCommand
    117  IO              | WALInitSync
     74  IO              | DataFileWrite
     66  IO              | WALInitWrite
     35  IO              | DataFileFlush
     29  IO              | DataFileExtend
     24  LWLock          | LockFastPath
     14  IO              | DataFileRead
      2  IO              | SLRUWrite
      2  IO              | DataFileImmediateSync
      2  IO              | BufFileWrite
      1 Tuples only is on.
      1  LWLock          | BufferMapping
      1  IO              | WALRead
      1  IO              | SLRURead
      1  IO              | BufFileRead

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
From 2c46ebcb95954580da3ece4bd8ce5d5b1d824694 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Fri, 3 Mar 2023 10:33:06 +0000
Subject: [PATCH v7] Improve WALRead() to suck data directly from WAL buffers

---
 src/backend/access/transam/xlog.c       | 140 ++++++++++++++++++++++++
 src/backend/access/transam/xlogreader.c |  45 +++++++-
 src/include/access/xlog.h               |   6 +
 3 files changed, 189 insertions(+), 2 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 87af608d15..51dd101d12 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -1639,6 +1639,146 @@ GetXLogBuffer(XLogRecPtr ptr, TimeLineID tli)
 	return cachedPos + ptr % XLOG_BLCKSZ;
 }
 
+/*
+ * Read WAL from WAL buffers.
+ *
+ * Read 'count' bytes of WAL from WAL buffers into 'buf', starting at location
+ * 'startptr', on timeline 'tli' and set the read bytes to 'read_bytes'.
+ *
+ * Note that this function reads as much as it can from WAL buffers, meaning,
+ * it may not read all the requested 'count' bytes. The caller must be aware of
+ * this and deal with it.
+ */
+void
+XLogReadFromBuffers(XLogRecPtr startptr,
+					TimeLineID tli,
+					Size count,
+					char *buf,
+					Size *read_bytes)
+{
+	XLogRecPtr	ptr;
+	char    *dst;
+	Size    nbytes;
+
+	Assert(!XLogRecPtrIsInvalid(startptr));
+	Assert(count > 0);
+	Assert(startptr <= GetFlushRecPtr(NULL));
+	Assert(!RecoveryInProgress());
+	Assert(tli == GetWALInsertionTimeLine());
+
+	ptr = startptr;
+	nbytes = count;
+	dst = buf;
+	*read_bytes = 0;
+
+	/*
+	 * Holding WALBufMappingLock ensures inserters don't overwrite this value
+	 * while we are reading it. We try to acquire it in shared mode so that the
+	 * concurrent WAL readers are also allowed. We try to do as little work as
+	 * possible while holding the lock to not impact concurrent WAL writers
+	 * much. We quickly exit to not cause any contention, if the lock isn't
+	 * immediately available.
+	 */
+	if (!LWLockConditionalAcquire(WALBufMappingLock, LW_SHARED))
+		return;
+
+	while (nbytes > 0)
+	{
+		XLogRecPtr origptr;
+		XLogRecPtr	expectedEndPtr;
+		XLogRecPtr	endptr;
+		int 	idx;
+
+		origptr = ptr;
+		idx = XLogRecPtrToBufIdx(ptr);
+		expectedEndPtr = ptr;
+		expectedEndPtr += XLOG_BLCKSZ - ptr % XLOG_BLCKSZ;
+
+		endptr = XLogCtl->xlblocks[idx];
+
+		if (expectedEndPtr == endptr)
+		{
+			char	*page;
+			char    *data;
+			XLogPageHeader	phdr;
+
+			/*
+			 * We found WAL buffer page containing given XLogRecPtr. Get
+			 * starting address of the page and a pointer to the right location
+			 * of given XLogRecPtr in that page.
+			 */
+			page = XLogCtl->pages + idx * (Size) XLOG_BLCKSZ;
+			data = page + ptr % XLOG_BLCKSZ;
+
+			/* Read what is wanted, not the whole page. */
+			if ((data + nbytes) <= (page + XLOG_BLCKSZ))
+			{
+				/* All the bytes are in one page. */
+				memcpy(dst, data, nbytes);
+				*read_bytes += nbytes;
+				nbytes = 0;
+			}
+			else
+			{
+				Size	nread;
+
+				/*
+				 * Not all the bytes are in one page. Read available bytes on
+				 * the current page, copy them over to output buffer and
+				 * continue to read remaining bytes.
+				 */
+				nread = XLOG_BLCKSZ - (data - page);
+				Assert(nread > 0 && nread <= nbytes);
+				memcpy(dst, data, nread);
+				ptr += nread;
+				nbytes -= nread;
+				dst += nread;
+				*read_bytes += nread;
+			}
+
+
+			/*
+			 * The fact that we acquire WALBufMappingLock while reading the WAL
+			 * buffer page itself guarantees that no one else initializes it or
+			 * makes it ready for next use in AdvanceXLInsertBuffer().
+			 *
+			 * However, we perform basic page header checks for ensuring that
+			 * we are not reading a page that just got initialized. Callers
+			 * will anyway perform extensive page-level and record-level
+			 * checks.
+			 */
+			phdr = (XLogPageHeader) page;
+
+			if (!(phdr->xlp_magic == XLOG_PAGE_MAGIC &&
+				  phdr->xlp_pageaddr == (origptr - (origptr % XLOG_BLCKSZ)) &&
+				  phdr->xlp_tli == tli))
+			{
+				/*
+				 * WAL buffer page doesn't look valid, so return with what we
+				 * have read so far.
+				 */
+				break;
+			}
+		}
+		else
+		{
+			/*
+			 * Requested WAL isn't available in WAL buffers, so return with
+			 * what we have read so far.
+			 */
+			break;
+		}
+	}
+
+	LWLockRelease(WALBufMappingLock);
+
+	/* We never read more than what the caller has asked for. */
+	Assert(*read_bytes <= count);
+
+	elog(DEBUG1, "read %zu bytes out of %zu bytes from WAL buffers for given LSN %X/%X, Timeline ID %u",
+		 *read_bytes, count, LSN_FORMAT_ARGS(startptr), tli);
+}
+
 /*
  * Converts a "usable byte position" to XLogRecPtr. A usable byte position
  * is the position starting from the beginning of WAL, excluding all WAL
diff --git a/src/backend/access/transam/xlogreader.c b/src/backend/access/transam/xlogreader.c
index cadea21b37..bd11df448a 100644
--- a/src/backend/access/transam/xlogreader.c
+++ b/src/backend/access/transam/xlogreader.c
@@ -1486,8 +1486,7 @@ err:
  * Returns true if succeeded, false if an error occurs, in which case
  * 'errinfo' receives error details.
  *
- * XXX probably this should be improved to suck data directly from the
- * WAL buffers when possible.
+ * When possible, this function reads data directly from WAL buffers.
  */
 bool
 WALRead(XLogReaderState *state,
@@ -1498,6 +1497,48 @@ WALRead(XLogReaderState *state,
 	XLogRecPtr	recptr;
 	Size		nbytes;
 
+#ifndef FRONTEND
+	/* Frontend tools have no idea of WAL buffers. */
+	Size        read_bytes;
+
+	/*
+	 * When possible, read WAL from WAL buffers. We skip this step and read
+	 * from the WAL file as usual in two cases: when the server is in recovery
+	 * (standby mode, archive or crash recovery), in which case WAL buffers
+	 * are not used, or when the server is inserting WAL on a timeline other
+	 * than the one we're trying to read from.
+	 */
+	if (!RecoveryInProgress() &&
+		tli == GetWALInsertionTimeLine())
+	{
+		XLogReadFromBuffers(startptr, tli, count, buf, &read_bytes);
+
+		/*
+		 * Check if we have read fully (hit), partially (partial hit) or
+		 * nothing (miss) from WAL buffers. If we have read either partially or
+		 * nothing, then continue to read the remaining bytes the usual way,
+		 * that is, read from WAL file.
+		 */
+		if (count == read_bytes)
+		{
+			/* Buffer hit, so return. */
+			return true;
+		}
+		else if (read_bytes > 0 && count > read_bytes)
+		{
+			/*
+			 * Buffer partial hit, so reset the state to count the read bytes
+			 * and continue.
+			 */
+			buf += read_bytes;
+			startptr += read_bytes;
+			count -= read_bytes;
+		}
+
+		/* Buffer miss i.e., read_bytes = 0, so continue */
+	}
+#endif	/* FRONTEND */
+
 	p = buf;
 	recptr = startptr;
 	nbytes = count;
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index cfe5409738..c9941aa001 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -247,6 +247,12 @@ extern XLogRecPtr GetLastImportantRecPtr(void);
 
 extern void SetWalWriterSleeping(bool sleeping);
 
+extern void XLogReadFromBuffers(XLogRecPtr startptr,
+								TimeLineID tli,
+								Size count,
+								char *buf,
+								Size *read_bytes);
+
 /*
  * Routines used by xlogrecovery.c to call back into xlog.c during recovery.
  */
-- 
2.34.1

From ad65a3c413720462c6eae0d5ea4c08ce656e582f Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Fri, 3 Mar 2023 10:33:36 +0000
Subject: [PATCH v7] Add test module for verifying WAL read from WAL buffers

---
 src/test/modules/Makefile                     |  1 +
 src/test/modules/meson.build                  |  1 +
 .../test_wal_read_from_buffers/.gitignore     |  4 ++
 .../test_wal_read_from_buffers/Makefile       | 23 +++++++++
 .../test_wal_read_from_buffers/meson.build    | 36 +++++++++++++
 .../test_wal_read_from_buffers/t/001_basic.pl | 44 ++++++++++++++++
 .../test_wal_read_from_buffers--1.0.sql       | 16 ++++++
 .../test_wal_read_from_buffers.c              | 51 +++++++++++++++++++
 .../test_wal_read_from_buffers.control        |  4 ++
 9 files changed, 180 insertions(+)
 create mode 100644 src/test/modules/test_wal_read_from_buffers/.gitignore
 create mode 100644 src/test/modules/test_wal_read_from_buffers/Makefile
 create mode 100644 src/test/modules/test_wal_read_from_buffers/meson.build
 create mode 100644 src/test/modules/test_wal_read_from_buffers/t/001_basic.pl
 create mode 100644 src/test/modules/test_wal_read_from_buffers/test_wal_read_from_buffers--1.0.sql
 create mode 100644 src/test/modules/test_wal_read_from_buffers/test_wal_read_from_buffers.c
 create mode 100644 src/test/modules/test_wal_read_from_buffers/test_wal_read_from_buffers.control

diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile
index c629cbe383..ea33361f69 100644
--- a/src/test/modules/Makefile
+++ b/src/test/modules/Makefile
@@ -33,6 +33,7 @@ SUBDIRS = \
 		  test_rls_hooks \
 		  test_shm_mq \
 		  test_slru \
+		  test_wal_read_from_buffers \
 		  unsafe_tests \
 		  worker_spi
 
diff --git a/src/test/modules/meson.build b/src/test/modules/meson.build
index 1baa6b558d..e3ffd3538d 100644
--- a/src/test/modules/meson.build
+++ b/src/test/modules/meson.build
@@ -29,5 +29,6 @@ subdir('test_regex')
 subdir('test_rls_hooks')
 subdir('test_shm_mq')
 subdir('test_slru')
+subdir('test_wal_read_from_buffers')
 subdir('unsafe_tests')
 subdir('worker_spi')
diff --git a/src/test/modules/test_wal_read_from_buffers/.gitignore b/src/test/modules/test_wal_read_from_buffers/.gitignore
new file mode 100644
index 0000000000..5dcb3ff972
--- /dev/null
+++ b/src/test/modules/test_wal_read_from_buffers/.gitignore
@@ -0,0 +1,4 @@
+# Generated subdirectories
+/log/
+/results/
+/tmp_check/
diff --git a/src/test/modules/test_wal_read_from_buffers/Makefile b/src/test/modules/test_wal_read_from_buffers/Makefile
new file mode 100644
index 0000000000..7a09533ec7
--- /dev/null
+++ b/src/test/modules/test_wal_read_from_buffers/Makefile
@@ -0,0 +1,23 @@
+# src/test/modules/test_wal_read_from_buffers/Makefile
+
+MODULE_big = test_wal_read_from_buffers
+OBJS = \
+	$(WIN32RES) \
+	test_wal_read_from_buffers.o
+PGFILEDESC = "test_wal_read_from_buffers - test module to verify that WAL can be read from WAL buffers"
+
+EXTENSION = test_wal_read_from_buffers
+DATA = test_wal_read_from_buffers--1.0.sql
+
+TAP_TESTS = 1
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/test_wal_read_from_buffers
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/src/test/modules/test_wal_read_from_buffers/meson.build b/src/test/modules/test_wal_read_from_buffers/meson.build
new file mode 100644
index 0000000000..40a36edc07
--- /dev/null
+++ b/src/test/modules/test_wal_read_from_buffers/meson.build
@@ -0,0 +1,36 @@
+# Copyright (c) 2022-2023, PostgreSQL Global Development Group
+
+# FIXME: prevent install during main install, but not during test :/
+
+test_wal_read_from_buffers_sources = files(
+  'test_wal_read_from_buffers.c',
+)
+
+if host_system == 'windows'
+  test_wal_read_from_buffers_sources += rc_lib_gen.process(win32ver_rc, extra_args: [
+    '--NAME', 'test_wal_read_from_buffers',
+    '--FILEDESC', 'test_wal_read_from_buffers - test module to verify that WAL can be read from WAL buffers',])
+endif
+
+test_wal_read_from_buffers = shared_module('test_wal_read_from_buffers',
+  test_wal_read_from_buffers_sources,
+  kwargs: pg_mod_args,
+)
+testprep_targets += test_wal_read_from_buffers
+
+install_data(
+  'test_wal_read_from_buffers.control',
+  'test_wal_read_from_buffers--1.0.sql',
+  kwargs: contrib_data_args,
+)
+
+tests += {
+  'name': 'test_wal_read_from_buffers',
+  'sd': meson.current_source_dir(),
+  'bd': meson.current_build_dir(),
+  'tap': {
+    'tests': [
+      't/001_basic.pl',
+    ],
+  },
+}
diff --git a/src/test/modules/test_wal_read_from_buffers/t/001_basic.pl b/src/test/modules/test_wal_read_from_buffers/t/001_basic.pl
new file mode 100644
index 0000000000..3448e0bed6
--- /dev/null
+++ b/src/test/modules/test_wal_read_from_buffers/t/001_basic.pl
@@ -0,0 +1,44 @@
+# Copyright (c) 2021-2023, PostgreSQL Global Development Group
+
+use strict;
+use warnings;
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+my $node = PostgreSQL::Test::Cluster->new('main');
+
+$node->init;
+
+# Ensure nobody interferes with us so that the WAL in WAL buffers doesn't get
+# overwritten while running tests.
+$node->append_conf(
+	'postgresql.conf', qq(
+autovacuum = off
+checkpoint_timeout = 1h
+wal_writer_delay = 10000ms
+wal_writer_flush_after = 1GB
+));
+$node->start;
+
+# Setup.
+$node->safe_psql('postgres', 'CREATE EXTENSION test_wal_read_from_buffers');
+
+# Get current insert LSN. After this, we generate some WAL which is guaranteed
+# to be in WAL buffers as there is no other WAL generating activity
+# happening on the server. We then verify if we can read the WAL from WAL
+# buffers using this LSN.
+my $lsn =
+  $node->safe_psql('postgres', 'SELECT pg_current_wal_insert_lsn();');
+
+# Generate minimal WAL so that WAL buffers don't get overwritten.
+$node->safe_psql('postgres',
+	"CREATE TABLE t (c int); INSERT INTO t VALUES (1);");
+
+# Check if WAL is successfully read from WAL buffers.
+my $result = $node->safe_psql('postgres',
+	qq{SELECT test_wal_read_from_buffers('$lsn')});
+is($result, 't', "WAL is successfully read from WAL buffers");
+
+done_testing();
diff --git a/src/test/modules/test_wal_read_from_buffers/test_wal_read_from_buffers--1.0.sql b/src/test/modules/test_wal_read_from_buffers/test_wal_read_from_buffers--1.0.sql
new file mode 100644
index 0000000000..8e89910133
--- /dev/null
+++ b/src/test/modules/test_wal_read_from_buffers/test_wal_read_from_buffers--1.0.sql
@@ -0,0 +1,16 @@
+/* src/test/modules/test_wal_read_from_buffers/test_wal_read_from_buffers--1.0.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "CREATE EXTENSION test_wal_read_from_buffers" to load this file. \quit
+
+--
+-- test_wal_read_from_buffers()
+--
+-- Returns true if WAL data at a given LSN can be read from WAL buffers.
+-- Otherwise returns false.
+--
+CREATE FUNCTION test_wal_read_from_buffers(IN lsn pg_lsn,
+    OUT read_from_buffers bool
+)
+AS 'MODULE_PATHNAME', 'test_wal_read_from_buffers'
+LANGUAGE C STRICT PARALLEL UNSAFE;
diff --git a/src/test/modules/test_wal_read_from_buffers/test_wal_read_from_buffers.c b/src/test/modules/test_wal_read_from_buffers/test_wal_read_from_buffers.c
new file mode 100644
index 0000000000..ca8645101a
--- /dev/null
+++ b/src/test/modules/test_wal_read_from_buffers/test_wal_read_from_buffers.c
@@ -0,0 +1,51 @@
+/*--------------------------------------------------------------------------
+ *
+ * test_wal_read_from_buffers.c
+ *		Test code for verifying WAL read from WAL buffers.
+ *
+ * Portions Copyright (c) 1996-2023, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ *	src/test/modules/test_wal_read_from_buffers/test_wal_read_from_buffers.c
+ * -------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "access/xlog.h"
+#include "fmgr.h"
+#include "utils/pg_lsn.h"
+
+PG_MODULE_MAGIC;
+
+/*
+ * SQL function for verifying that WAL data at a given LSN can be read from WAL
+ * buffers. Returns true if read from WAL buffers, otherwise false.
+ */
+PG_FUNCTION_INFO_V1(test_wal_read_from_buffers);
+Datum
+test_wal_read_from_buffers(PG_FUNCTION_ARGS)
+{
+	XLogRecPtr	lsn;
+	Size	read_bytes;
+	TimeLineID	tli;
+	char	data[XLOG_BLCKSZ] = {0};
+
+	lsn = PG_GETARG_LSN(0);
+
+	if (XLogRecPtrIsInvalid(lsn))
+		PG_RETURN_BOOL(false);
+
+	if (RecoveryInProgress())
+		ereport(ERROR,
+				(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+				 errmsg("recovery is in progress"),
+				 errhint("WAL control functions cannot be executed during recovery.")));
+
+	tli = GetWALInsertionTimeLine();
+
+	XLogReadFromBuffers(lsn, tli, XLOG_BLCKSZ, data, &read_bytes);
+
+	PG_RETURN_BOOL(read_bytes > 0);
+}
diff --git a/src/test/modules/test_wal_read_from_buffers/test_wal_read_from_buffers.control b/src/test/modules/test_wal_read_from_buffers/test_wal_read_from_buffers.control
new file mode 100644
index 0000000000..7852b3e331
--- /dev/null
+++ b/src/test/modules/test_wal_read_from_buffers/test_wal_read_from_buffers.control
@@ -0,0 +1,4 @@
+comment = 'Test code for verifying WAL read from WAL buffers'
+default_version = '1.0'
+module_pathname = '$libdir/test_wal_read_from_buffers'
+relocatable = true
-- 
2.34.1
