Hi,

In the attached patch set, I've added the missing IO operation counts for
certain IO Paths, and I've enumerated in the commit message which IO
Path/IO Operation combinations are not currently counted or are not
possible.

There is a TODO in HandleWalWriterInterrupts() about removing
pgstat_report_wal(), since it is called immediately before proc_exit().

I was wondering whether LocalBufferAlloc() should increment the counter,
or whether I should wait and increment it in GetLocalBufferStorage().

I also realized that I am not differentiating between IOPATH_SHARED and
IOPATH_STRATEGY for IOOP_FSYNC. But, given that we don't know what type
of buffer we are fsync'ing by the time we call register_dirty_segment(),
I'm not sure how we would fix this.

On Wed, Jul 6, 2022 at 3:20 PM Andres Freund <and...@anarazel.de> wrote:

> On 2022-07-05 13:24:55 -0400, Melanie Plageman wrote:
> > From 2d089e26236c55d1be5b93833baa0cf7667ba38d Mon Sep 17 00:00:00 2001
> > From: Melanie Plageman <melanieplage...@gmail.com>
> > Date: Tue, 28 Jun 2022 11:33:04 -0400
> > Subject: [PATCH v22 1/3] Add BackendType for standalone backends
> >
> > All backends should have a BackendType to enable statistics reporting
> > per BackendType.
> >
> > Add a new BackendType for standalone backends, B_STANDALONE_BACKEND (and
> > alphabetize the BackendTypes). Both the bootstrap backend and single
> > user mode backends will have BackendType B_STANDALONE_BACKEND.
> >
> > Author: Melanie Plageman <melanieplage...@gmail.com>
> > Discussion:
> https://www.postgresql.org/message-id/CAAKRu_aaq33UnG4TXq3S-OSXGWj1QGf0sU%2BECH4tNwGFNERkZA%40mail.gmail.com
> > ---
> >  src/backend/utils/init/miscinit.c | 17 +++++++++++------
> >  src/include/miscadmin.h           |  5 +++--
> >  2 files changed, 14 insertions(+), 8 deletions(-)
> >
> > diff --git a/src/backend/utils/init/miscinit.c
> b/src/backend/utils/init/miscinit.c
> > index eb43b2c5e5..07e6db1a1c 100644
> > --- a/src/backend/utils/init/miscinit.c
> > +++ b/src/backend/utils/init/miscinit.c
> > @@ -176,6 +176,8 @@ InitStandaloneProcess(const char *argv0)
> >  {
> >       Assert(!IsPostmasterEnvironment);
> >
> > +     MyBackendType = B_STANDALONE_BACKEND;
>
> Hm. This is used for singleuser mode as well as bootstrap. Should we
> split those? It's not like bootstrap mode really matters for stats, so
> I'm inclined not to.
>
>
I have no opinion currently.
It depends on how commonly you think developers might want separate
bootstrap and single user mode IO stats.


>
> > @@ -375,6 +376,8 @@ BootstrapModeMain(int argc, char *argv[], bool
> check_only)
> >        * out the initial relation mapping files.
> >        */
> >       RelationMapFinishBootstrap();
> > +     // TODO: should this be done for bootstrap?
> > +     pgstat_report_io_ops();
>
> Hm. Not particularly useful, but also not harmful. But we don't need an
> explicit call, because it'll be done at process exit too. At least I
> think, it could be that it's different for bootstrap.
>
>
>
I've removed this and other occurrences which were before proc_exit()
(and thus redundant). (Though I did not explicitly check if it was
different for bootstrap.)


>
> > diff --git a/src/backend/postmaster/autovacuum.c
> b/src/backend/postmaster/autovacuum.c
> > index 2e146aac93..e6dbb1c4bb 100644
> > --- a/src/backend/postmaster/autovacuum.c
> > +++ b/src/backend/postmaster/autovacuum.c
> > @@ -1712,6 +1712,9 @@ AutoVacWorkerMain(int argc, char *argv[])
> >               recentXid = ReadNextTransactionId();
> >               recentMulti = ReadNextMultiXactId();
> >               do_autovacuum();
> > +
> > +             // TODO: should this be done more often somewhere in
> do_autovacuum()?
> > +             pgstat_report_io_ops();
> >       }
>
> Don't think you need all these calls before process exit - it'll happen
> via pgstat_shutdown_hook().
>
> IMO it'd be a good idea to add pgstat_report_io_ops() to
> pgstat_report_vacuum()/analyze(), so that the stats for a longrunning
> autovac worker get updated more regularly.
>

noted and fixed.


>
>
> > diff --git a/src/backend/postmaster/bgwriter.c
> b/src/backend/postmaster/bgwriter.c
> > index 91e6f6ea18..87e4b9e9bd 100644
> > --- a/src/backend/postmaster/bgwriter.c
> > +++ b/src/backend/postmaster/bgwriter.c
> > @@ -242,6 +242,7 @@ BackgroundWriterMain(void)
> >
> >               /* Report pending statistics to the cumulative stats
> system */
> >               pgstat_report_bgwriter();
> > +             pgstat_report_io_ops();
> >
> >               if (FirstCallSinceLastCheckpoint())
> >               {
>
> How about moving the pgstat_report_io_ops() into
> pgstat_report_bgwriter(), pgstat_report_autovacuum() etc? Seems
> unnecessary to have multiple pgstat_* calls in these places.
>
>
>
noted and fixed.


>
> > +/*
> > + * Flush out locally pending IO Operation statistics entries
> > + *
> > + * If nowait is true, this function returns false on lock failure.
> Otherwise
> > + * this function always returns true. Writer processes are mutually
> excluded
> > + * using LWLock, but readers are expected to use change-count protocol
> to avoid
> > + * interference with writers.
> > + *
> > + * If nowait is true, this function returns true if the lock could not
> be
> > + * acquired. Otherwise return false.
> > + *
> > + */
> > +bool
> > +pgstat_flush_io_ops(bool nowait)
> > +{
> > +     PgStat_IOPathOps *dest_io_path_ops;
> > +     PgStatShared_BackendIOPathOps *stats_shmem;
> > +
> > +     PgBackendStatus *beentry = MyBEEntry;
> > +
> > +     if (!have_ioopstats)
> > +             return false;
> > +
> > +     if (!beentry || beentry->st_backendType == B_INVALID)
> > +             return false;
> > +
> > +     stats_shmem = &pgStatLocal.shmem->io_ops;
> > +
> > +     if (!nowait)
> > +             LWLockAcquire(&stats_shmem->lock, LW_EXCLUSIVE);
> > +     else if (!LWLockConditionalAcquire(&stats_shmem->lock,
> LW_EXCLUSIVE))
> > +             return true;
>
> Wonder if it's worth making the lock specific to the backend type?
>

I've added another lock to PgStat_IOPathOps so that each BackendType
can be locked separately. But I've also kept the lock in
PgStatShared_BackendIOPathOps so that reset_all and snapshot can be
done easily.


>
>
> > +     dest_io_path_ops =
> > +
>  &stats_shmem->stats[backend_type_get_idx(beentry->st_backendType)];
> > +
>
> This could be done before acquiring the lock, right?
>
>
> > +void
> > +pgstat_io_ops_snapshot_cb(void)
> > +{
> > +     PgStatShared_BackendIOPathOps *stats_shmem =
> &pgStatLocal.shmem->io_ops;
> > +     PgStat_IOPathOps *snapshot_ops = pgStatLocal.snapshot.io_path_ops;
> > +     PgStat_IOPathOps *reset_ops;
> > +
> > +     PgStat_IOPathOps *reset_offset = stats_shmem->reset_offset;
> > +     PgStat_IOPathOps reset[BACKEND_NUM_TYPES];
> > +
> > +     pgstat_copy_changecounted_stats(snapshot_ops,
> > +                     &stats_shmem->stats, sizeof(stats_shmem->stats),
> > +                     &stats_shmem->changecount);
>
> This doesn't make sense - with multiple writers you can't use the
> changecount approach (and you don't in the flush part above).
>
>
> > +     LWLockAcquire(&stats_shmem->lock, LW_SHARED);
> > +     memcpy(&reset, reset_offset, sizeof(stats_shmem->stats));
> > +     LWLockRelease(&stats_shmem->lock);
>
> Which then also means that you don't need the reset offset stuff. It's
> only there because with the changecount approach we can't take a lock to
> reset the stats (since there is no lock). With a lock you can just reset
> the shared state.
>

Yes, I believe I have cleaned up all of this embarrassing mess. I use the
lock in PgStatShared_BackendIOPathOps for reset all and snapshot and the
locks in PgStat_IOPathOps for flush.


>
>
> > +void
> > +pgstat_count_io_op(IOOp io_op, IOPath io_path)
> > +{
> > +     PgStat_IOOpCounters *pending_counters =
> &pending_IOOpStats.data[io_path];
> > +     PgStat_IOOpCounters *cumulative_counters =
> > +                     &cumulative_IOOpStats.data[io_path];
>
> the pending_/cumultive_ prefix before an uppercase-first camelcase name
> seems ugly...
>
> > +     switch (io_op)
> > +     {
> > +             case IOOP_ALLOC:
> > +                     pending_counters->allocs++;
> > +                     cumulative_counters->allocs++;
> > +                     break;
> > +             case IOOP_EXTEND:
> > +                     pending_counters->extends++;
> > +                     cumulative_counters->extends++;
> > +                     break;
> > +             case IOOP_FSYNC:
> > +                     pending_counters->fsyncs++;
> > +                     cumulative_counters->fsyncs++;
> > +                     break;
> > +             case IOOP_WRITE:
> > +                     pending_counters->writes++;
> > +                     cumulative_counters->writes++;
> > +                     break;
> > +     }
> > +
> > +     have_ioopstats = true;
> > +}
>
> Doing two math ops / memory accesses every time seems off. Seems better
> to maintain cumultive_counters whenever reporting stats, just before
> zeroing pending_counters?
>

I've gone ahead and cut the cumulative counters concept.


>
>
> > +/*
> > + * Report IO operation statistics
> > + *
> > + * This works in much the same way as pgstat_flush_io_ops() but is
> meant for
> > + * BackendTypes like bgwriter for whom pgstat_report_stat() will not be
> called
> > + * frequently enough to keep shared memory stats fresh.
> > + * Backends not typically calling pgstat_report_stat() can invoke
> > + * pgstat_report_io_ops() explicitly.
> > + */
> > +void
> > +pgstat_report_io_ops(void)
> > +{
>
> This shouldn't be needed - the flush function above can be used.
>

Fixed.


>
>
> > +     PgStat_IOPathOps *dest_io_path_ops;
> > +     PgStatShared_BackendIOPathOps *stats_shmem;
> > +
> > +     PgBackendStatus *beentry = MyBEEntry;
> > +
> > +     Assert(!pgStatLocal.shmem->is_shutdown);
> > +     pgstat_assert_is_up();
> > +
> > +     if (!have_ioopstats)
> > +             return;
> > +
> > +     if (!beentry || beentry->st_backendType == B_INVALID)
> > +             return;
>
> Is there a case where this may be called where we have no beentry?
>
> Why not just use MyBackendType?
>

Fixed.


>
>
> > +     stats_shmem = &pgStatLocal.shmem->io_ops;
> > +
> > +     dest_io_path_ops =
> > +
>  &stats_shmem->stats[backend_type_get_idx(beentry->st_backendType)];
> > +
> > +     pgstat_begin_changecount_write(&stats_shmem->changecount);
>
> A mentioned before, the changecount stuff doesn't apply here. You need a
> lock.
>

Fixed.


>
>
> > +PgStat_IOPathOps *
> > +pgstat_fetch_backend_io_path_ops(void)
> > +{
> > +     pgstat_snapshot_fixed(PGSTAT_KIND_IOOPS);
> > +     return pgStatLocal.snapshot.io_path_ops;
> > +}
> > +
> > +PgStat_Counter
> > +pgstat_fetch_cumulative_io_ops(IOPath io_path, IOOp io_op)
> > +{
> > +     PgStat_IOOpCounters *counters =
> &cumulative_IOOpStats.data[io_path];
> > +
> > +     switch (io_op)
> > +     {
> > +             case IOOP_ALLOC:
> > +                     return counters->allocs;
> > +             case IOOP_EXTEND:
> > +                     return counters->extends;
> > +             case IOOP_FSYNC:
> > +                     return counters->fsyncs;
> > +             case IOOP_WRITE:
> > +                     return counters->writes;
> > +             default:
> > +                     elog(ERROR, "IO Operation %s for IO Path %s is
> undefined.",
> > +                                     pgstat_io_op_desc(io_op),
> pgstat_io_path_desc(io_path));
> > +     }
> > +}
>
> There's currently no user for this, right? Maybe let's just defer the
> cumulative stuff until we need it?
>

Removed.


>
>
> > +const char *
> > +pgstat_io_path_desc(IOPath io_path)
> > +{
> > +     const char *io_path_desc = "Unknown IO Path";
> > +
>
> This should be unreachable, right?
>

Changed it to an error.


>
>
> > From f2b5b75f5063702cbc3c64efdc1e7ef3cf1acdb4 Mon Sep 17 00:00:00 2001
> > From: Melanie Plageman <melanieplage...@gmail.com>
> > Date: Mon, 4 Jul 2022 15:44:17 -0400
> > Subject: [PATCH v22 3/3] Add system view tracking IO ops per backend type
>
> > Add pg_stat_buffers, a system view which tracks the number of IO
> > operations (allocs, writes, fsyncs, and extends) done through each IO
> > path (e.g. shared buffers, local buffers, unbuffered IO) by each type of
> > backend.
>
> I think I like pg_stat_io a bit better? Nearly everything in here seems
> to fit better in that.
>
> I guess we could split out buffers allocated, but that's actually
> interesting in the context of the kind of IO too.
>

Changed it to pg_stat_io.


>
> > +CREATE VIEW pg_stat_buffers AS
> > +SELECT
> > +       b.backend_type,
> > +       b.io_path,
> > +       b.alloc,
> > +       b.extend,
> > +       b.fsync,
> > +       b.write,
> > +       b.stats_reset
> > +FROM pg_stat_get_buffers() b;
>
> Do we want to expose all data to all users? I guess pg_stat_bgwriter
> does? But this does split things out a lot more...
>
>
I didn't see another similar example limiting access.


> >  DROP TABLE trunc_stats_test, trunc_stats_test1, trunc_stats_test2,
> trunc_stats_test3, trunc_stats_test4;
> >  DROP TABLE prevstats;
> > +SELECT pg_stat_reset_shared('buffers');
> > + pg_stat_reset_shared
> > +----------------------
> > +
> > +(1 row)
> > +
> > +SELECT pg_stat_force_next_flush();
> > + pg_stat_force_next_flush
> > +--------------------------
> > +
> > +(1 row)
> > +
> > +SELECT write = 0 FROM pg_stat_buffers WHERE io_path = 'Shared' and
> backend_type = 'checkpointer';
> > + ?column?
> > +----------
> > + t
> > +(1 row)
>
>
> Don't think you can rely on that. The lookup of the view, functions
> might have needed to load catalog data, which might have needed to evict
> buffers.  I think you can do something more reliable by checking that
> there's more written buffers after a checkpoint than before, or such.
>
>
Yes, per an off list suggestion by you, I have changed the tests to use a
sum of writes. I've also added a test for IOPATH_LOCAL and fixed some of
the missing calls to count IO Operations for IOPATH_LOCAL and
IOPATH_STRATEGY.
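For reference, the shape of the sum-based check is roughly the following (illustrative psql snippet, not the exact regression test): capture the sum of writes across backend types before forcing IO, then verify it did not decrease afterwards, which tolerates other backends also writing in the meantime:

```sql
-- Illustrative only: sum writes across all backend types for one IO path.
SELECT sum(write) AS writes_before FROM pg_stat_io
WHERE io_path = 'Shared' \gset

CHECKPOINT;
SELECT pg_stat_force_next_flush();

SELECT sum(write) >= :writes_before FROM pg_stat_io
WHERE io_path = 'Shared';
```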

I struggled to come up with a way to test that writes for a particular
type of backend are counted correctly, since a dirty buffer could be
written out by another type of backend before the target BackendType has
a chance to write it out.

I also struggled to come up with a way to test IO operations for
background workers. I'm not sure of a way to deterministically have a
background worker do a particular kind of IO in a test scenario.

I'm not sure how to cause a strategy "extend" for testing.


>
> Would be nice to have something testing that the ringbuffer stats stuff
> does something sensible - that feels not entirely trivial.
>
>
I've added a test to test that reused strategy buffers are counted as
allocs. I would like to add a test which checks that if a buffer in the
ring is pinned and thus not reused, that it is not counted as a strategy
alloc, but I found it challenging without a way to pause vacuuming, pin
a buffer, then resume vacuuming.

Thanks,
Melanie
From 9d8fdbcf8dde109e84b680c8160c0174574a2c05 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplage...@gmail.com>
Date: Tue, 28 Jun 2022 11:33:04 -0400
Subject: [PATCH v23 1/3] Add BackendType for standalone backends

All backends should have a BackendType to enable statistics reporting
per BackendType.

Add a new BackendType for standalone backends, B_STANDALONE_BACKEND (and
alphabetize the BackendTypes). Both the bootstrap backend and single
user mode backends will have BackendType B_STANDALONE_BACKEND.

Author: Melanie Plageman <melanieplage...@gmail.com>
Discussion: https://www.postgresql.org/message-id/CAAKRu_aaq33UnG4TXq3S-OSXGWj1QGf0sU%2BECH4tNwGFNERkZA%40mail.gmail.com
---
 src/backend/utils/init/miscinit.c | 17 +++++++++++------
 src/include/miscadmin.h           |  5 +++--
 2 files changed, 14 insertions(+), 8 deletions(-)

diff --git a/src/backend/utils/init/miscinit.c b/src/backend/utils/init/miscinit.c
index eb43b2c5e5..07e6db1a1c 100644
--- a/src/backend/utils/init/miscinit.c
+++ b/src/backend/utils/init/miscinit.c
@@ -176,6 +176,8 @@ InitStandaloneProcess(const char *argv0)
 {
 	Assert(!IsPostmasterEnvironment);
 
+	MyBackendType = B_STANDALONE_BACKEND;
+
 	/*
 	 * Start our win32 signal implementation
 	 */
@@ -255,6 +257,9 @@ GetBackendTypeDesc(BackendType backendType)
 		case B_INVALID:
 			backendDesc = "not initialized";
 			break;
+		case B_ARCHIVER:
+			backendDesc = "archiver";
+			break;
 		case B_AUTOVAC_LAUNCHER:
 			backendDesc = "autovacuum launcher";
 			break;
@@ -273,6 +278,12 @@ GetBackendTypeDesc(BackendType backendType)
 		case B_CHECKPOINTER:
 			backendDesc = "checkpointer";
 			break;
+		case B_LOGGER:
+			backendDesc = "logger";
+			break;
+		case B_STANDALONE_BACKEND:
+			backendDesc = "standalone backend";
+			break;
 		case B_STARTUP:
 			backendDesc = "startup";
 			break;
@@ -285,12 +296,6 @@ GetBackendTypeDesc(BackendType backendType)
 		case B_WAL_WRITER:
 			backendDesc = "walwriter";
 			break;
-		case B_ARCHIVER:
-			backendDesc = "archiver";
-			break;
-		case B_LOGGER:
-			backendDesc = "logger";
-			break;
 	}
 
 	return backendDesc;
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index ea9a56d395..5276bf25a1 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -316,18 +316,19 @@ extern void SwitchBackToLocalLatch(void);
 typedef enum BackendType
 {
 	B_INVALID = 0,
+	B_ARCHIVER,
 	B_AUTOVAC_LAUNCHER,
 	B_AUTOVAC_WORKER,
 	B_BACKEND,
 	B_BG_WORKER,
 	B_BG_WRITER,
 	B_CHECKPOINTER,
+	B_LOGGER,
+	B_STANDALONE_BACKEND,
 	B_STARTUP,
 	B_WAL_RECEIVER,
 	B_WAL_SENDER,
 	B_WAL_WRITER,
-	B_ARCHIVER,
-	B_LOGGER,
 } BackendType;
 
 extern PGDLLIMPORT BackendType MyBackendType;
-- 
2.34.1

From b20d5fcc16492b1934d9a8cc8144508505db5d6b Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplage...@gmail.com>
Date: Wed, 29 Jun 2022 18:37:42 -0400
Subject: [PATCH v23 2/3] Track IO operation statistics

Introduce "IOOp", an IO operation done by a backend, and "IOPath", the
location or type of IO done by a backend. For example, the checkpointer
may write a shared buffer out. This would be counted as an IOOp write on
an IOPath IOPATH_SHARED by BackendType "checkpointer".

Each IOOp (alloc, fsync, extend, write) is counted per IOPath
(direct, local, shared, or strategy) through a call to
pgstat_count_io_op().

The primary concern of these statistics is IO operations on data blocks
during the course of normal database operations. IO done by, for
example, the archiver or syslogger is not counted in these statistics.

IOPATH_LOCAL and IOPATH_SHARED IOPaths concern operations on local
and shared buffers.

The IOPATH_STRATEGY IOPath concerns buffers alloc'd/written/read/fsync'd
as part of a BufferAccessStrategy.

The IOPATH_DIRECT IOPath concerns blocks of IO which are read, written,
or fsync'd using smgrwrite/extend/immedsync directly (as opposed to
through [Local]BufferAlloc()).

Note that this commit does not add code to increment IOPATH_DIRECT. A
future patch adding wrappers for smgrwrite(), smgrextend(), and
smgrimmedsync() would provide a good location to call
pgstat_count_io_op() for unbuffered IO and avoid regressions for future
users of these functions.

IOOP_ALLOC is counted for IOPATH_SHARED and IOPATH_LOCAL whenever a
buffer is acquired through [Local]BufferAlloc(). IOOP_ALLOC is invalid
for IOPATH_DIRECT. IOOP_ALLOC for IOPATH_STRATEGY is counted whenever a
buffer already in the strategy ring is reused.

Stats on IOOps for all IOPaths for a backend are initially accumulated
locally.

Later they are flushed to shared memory and accumulated with those from
all other backends, exited and live.

Some BackendTypes will not execute pgstat_report_stat() and thus must
explicitly call pgstat_report_io_ops() in order to flush their backend
local IO operation statistics to shared memory.

Author: Melanie Plageman <melanieplage...@gmail.com>
Reviewed-by: Justin Pryzby <pry...@telsasoft.com>
Discussion: https://www.postgresql.org/message-id/flat/20200124195226.lth52iydq2n2uilq%40alap3.anarazel.de
---
 src/backend/bootstrap/bootstrap.c             |   1 +
 src/backend/postmaster/checkpointer.c         |   1 +
 src/backend/postmaster/walwriter.c            |   1 +
 src/backend/storage/buffer/bufmgr.c           |  53 ++++--
 src/backend/storage/buffer/freelist.c         |  33 +++-
 src/backend/storage/buffer/localbuf.c         |   5 +
 src/backend/storage/sync/sync.c               |   2 +
 src/backend/utils/activity/Makefile           |   1 +
 src/backend/utils/activity/pgstat.c           |  24 +++
 src/backend/utils/activity/pgstat_bgwriter.c  |   5 +
 .../utils/activity/pgstat_checkpointer.c      |   5 +
 src/backend/utils/activity/pgstat_database.c  |   5 +
 src/backend/utils/activity/pgstat_io_ops.c    | 168 ++++++++++++++++++
 src/backend/utils/activity/pgstat_relation.c  |  10 ++
 src/backend/utils/activity/pgstat_wal.c       |   5 +
 src/backend/utils/adt/pgstatfuncs.c           |   4 +-
 src/include/miscadmin.h                       |   2 +
 src/include/pgstat.h                          |  53 ++++++
 src/include/storage/buf_internals.h           |   4 +-
 src/include/utils/backend_status.h            |  34 ++++
 src/include/utils/pgstat_internal.h           |  17 ++
 21 files changed, 417 insertions(+), 16 deletions(-)
 create mode 100644 src/backend/utils/activity/pgstat_io_ops.c

diff --git a/src/backend/bootstrap/bootstrap.c b/src/backend/bootstrap/bootstrap.c
index 088556ab54..963b05321e 100644
--- a/src/backend/bootstrap/bootstrap.c
+++ b/src/backend/bootstrap/bootstrap.c
@@ -33,6 +33,7 @@
 #include "miscadmin.h"
 #include "nodes/makefuncs.h"
 #include "pg_getopt.h"
+#include "pgstat.h"
 #include "storage/bufmgr.h"
 #include "storage/bufpage.h"
 #include "storage/condition_variable.h"
diff --git a/src/backend/postmaster/checkpointer.c b/src/backend/postmaster/checkpointer.c
index 5fc076fc14..a06331e1eb 100644
--- a/src/backend/postmaster/checkpointer.c
+++ b/src/backend/postmaster/checkpointer.c
@@ -1116,6 +1116,7 @@ ForwardSyncRequest(const FileTag *ftag, SyncRequestType type)
 		if (!AmBackgroundWriterProcess())
 			CheckpointerShmem->num_backend_fsync++;
 		LWLockRelease(CheckpointerCommLock);
+		pgstat_count_io_op(IOOP_FSYNC, IOPATH_SHARED);
 		return false;
 	}
 
diff --git a/src/backend/postmaster/walwriter.c b/src/backend/postmaster/walwriter.c
index e926f8c27c..64e58f17f6 100644
--- a/src/backend/postmaster/walwriter.c
+++ b/src/backend/postmaster/walwriter.c
@@ -301,6 +301,7 @@ HandleWalWriterInterrupts(void)
 		 * loop to avoid overloading the cumulative stats system, there may
 		 * exist unreported stats counters for the WAL writer.
 		 */
+		// TODO: This may not be needed also
 		pgstat_report_wal(true);
 
 		proc_exit(0);
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index e257ae23e4..be5fb1e5bf 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -482,7 +482,7 @@ static BufferDesc *BufferAlloc(SMgrRelation smgr,
 							   BlockNumber blockNum,
 							   BufferAccessStrategy strategy,
 							   bool *foundPtr);
-static void FlushBuffer(BufferDesc *buf, SMgrRelation reln);
+static void FlushBuffer(BufferDesc *buf, SMgrRelation reln, IOPath iopath);
 static void FindAndDropRelFileLocatorBuffers(RelFileLocator rlocator,
 											 ForkNumber forkNum,
 											 BlockNumber nForkBlock,
@@ -980,6 +980,16 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
 
 	if (isExtend)
 	{
+		IOPath io_path;
+
+		if (isLocalBuf)
+			io_path = IOPATH_LOCAL;
+		else if (strategy != NULL)
+			io_path = IOPATH_STRATEGY;
+		else
+			io_path = IOPATH_SHARED;
+
+		pgstat_count_io_op(IOOP_EXTEND, io_path);
 		/* new buffers are zero-filled */
 		MemSet((char *) bufBlock, 0, BLCKSZ);
 		/* don't set checksum for all-zero page */
@@ -1180,6 +1190,7 @@ BufferAlloc(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
 	/* Loop here in case we have to try another victim buffer */
 	for (;;)
 	{
+		bool from_ring;
 		/*
 		 * Ensure, while the spinlock's not yet held, that there's a free
 		 * refcount entry.
@@ -1190,7 +1201,7 @@ BufferAlloc(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
 		 * Select a victim buffer.  The buffer is returned with its header
 		 * spinlock still held!
 		 */
-		buf = StrategyGetBuffer(strategy, &buf_state);
+		buf = StrategyGetBuffer(strategy, &buf_state, &from_ring);
 
 		Assert(BUF_STATE_GET_REFCOUNT(buf_state) == 0);
 
@@ -1227,6 +1238,7 @@ BufferAlloc(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
 			if (LWLockConditionalAcquire(BufferDescriptorGetContentLock(buf),
 										 LW_SHARED))
 			{
+				IOPath iopath;
 				/*
 				 * If using a nondefault strategy, and writing the buffer
 				 * would require a WAL flush, let the strategy decide whether
@@ -1244,7 +1256,7 @@ BufferAlloc(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
 					UnlockBufHdr(buf, buf_state);
 
 					if (XLogNeedsFlush(lsn) &&
-						StrategyRejectBuffer(strategy, buf))
+						StrategyRejectBuffer(strategy, buf, &from_ring))
 					{
 						/* Drop lock/pin and loop around for another buffer */
 						LWLockRelease(BufferDescriptorGetContentLock(buf));
@@ -1253,13 +1265,27 @@ BufferAlloc(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
 					}
 				}
 
+				/*
+				 * When a strategy is in use, if the dirty buffer was selected
+				 * from the strategy ring and we did not bother checking the
+				 * freelist or doing a clock sweep to look for a clean shared
+				 * buffer to use, the write will be counted as a strategy
+				 * write. However, if the dirty buffer was obtained from the
+				 * freelist or a clock sweep, it is counted as a regular write.
+				 *
+				 * When a strategy is not in use, at this point the write can
+				 * only be a "regular" write of a dirty buffer.
+				 */
+
+				iopath = from_ring ? IOPATH_STRATEGY : IOPATH_SHARED;
+
 				/* OK, do the I/O */
 				TRACE_POSTGRESQL_BUFFER_WRITE_DIRTY_START(forkNum, blockNum,
 														  smgr->smgr_rlocator.locator.spcOid,
 														  smgr->smgr_rlocator.locator.dbOid,
 														  smgr->smgr_rlocator.locator.relNumber);
 
-				FlushBuffer(buf, NULL);
+				FlushBuffer(buf, NULL, iopath);
 				LWLockRelease(BufferDescriptorGetContentLock(buf));
 
 				ScheduleBufferTagForWriteback(&BackendWritebackContext,
@@ -2563,7 +2589,7 @@ SyncOneBuffer(int buf_id, bool skip_recently_used, WritebackContext *wb_context)
 	PinBuffer_Locked(bufHdr);
 	LWLockAcquire(BufferDescriptorGetContentLock(bufHdr), LW_SHARED);
 
-	FlushBuffer(bufHdr, NULL);
+	FlushBuffer(bufHdr, NULL, IOPATH_SHARED);
 
 	LWLockRelease(BufferDescriptorGetContentLock(bufHdr));
 
@@ -2810,9 +2836,12 @@ BufferGetTag(Buffer buffer, RelFileLocator *rlocator, ForkNumber *forknum,
  *
  * If the caller has an smgr reference for the buffer's relation, pass it
  * as the second parameter.  If not, pass NULL.
+ *
+ * IOPath will always be IOPATH_SHARED except when a buffer access strategy is
+ * used and the buffer being flushed is a buffer from the strategy ring.
  */
 static void
-FlushBuffer(BufferDesc *buf, SMgrRelation reln)
+FlushBuffer(BufferDesc *buf, SMgrRelation reln, IOPath iopath)
 {
 	XLogRecPtr	recptr;
 	ErrorContextCallback errcallback;
@@ -2892,6 +2921,8 @@ FlushBuffer(BufferDesc *buf, SMgrRelation reln)
 	 */
 	bufToWrite = PageSetChecksumCopy((Page) bufBlock, buf->tag.blockNum);
 
+	pgstat_count_io_op(IOOP_WRITE, iopath);
+
 	if (track_io_timing)
 		INSTR_TIME_SET_CURRENT(io_start);
 
@@ -3540,6 +3571,8 @@ FlushRelationBuffers(Relation rel)
 						  localpage,
 						  false);
 
+				pgstat_count_io_op(IOOP_WRITE, IOPATH_LOCAL);
+
 				buf_state &= ~(BM_DIRTY | BM_JUST_DIRTIED);
 				pg_atomic_unlocked_write_u32(&bufHdr->state, buf_state);
 
@@ -3575,7 +3608,7 @@ FlushRelationBuffers(Relation rel)
 		{
 			PinBuffer_Locked(bufHdr);
 			LWLockAcquire(BufferDescriptorGetContentLock(bufHdr), LW_SHARED);
-			FlushBuffer(bufHdr, RelationGetSmgr(rel));
+			FlushBuffer(bufHdr, RelationGetSmgr(rel), IOPATH_SHARED);
 			LWLockRelease(BufferDescriptorGetContentLock(bufHdr));
 			UnpinBuffer(bufHdr, true);
 		}
@@ -3670,7 +3703,7 @@ FlushRelationsAllBuffers(SMgrRelation *smgrs, int nrels)
 		{
 			PinBuffer_Locked(bufHdr);
 			LWLockAcquire(BufferDescriptorGetContentLock(bufHdr), LW_SHARED);
-			FlushBuffer(bufHdr, srelent->srel);
+			FlushBuffer(bufHdr, srelent->srel, IOPATH_SHARED);
 			LWLockRelease(BufferDescriptorGetContentLock(bufHdr));
 			UnpinBuffer(bufHdr, true);
 		}
@@ -3878,7 +3911,7 @@ FlushDatabaseBuffers(Oid dbid)
 		{
 			PinBuffer_Locked(bufHdr);
 			LWLockAcquire(BufferDescriptorGetContentLock(bufHdr), LW_SHARED);
-			FlushBuffer(bufHdr, NULL);
+			FlushBuffer(bufHdr, NULL, IOPATH_SHARED);
 			LWLockRelease(BufferDescriptorGetContentLock(bufHdr));
 			UnpinBuffer(bufHdr, true);
 		}
@@ -3905,7 +3938,7 @@ FlushOneBuffer(Buffer buffer)
 
 	Assert(LWLockHeldByMe(BufferDescriptorGetContentLock(bufHdr)));
 
-	FlushBuffer(bufHdr, NULL);
+	FlushBuffer(bufHdr, NULL, IOPATH_SHARED);
 }
 
 /*
diff --git a/src/backend/storage/buffer/freelist.c b/src/backend/storage/buffer/freelist.c
index 990e081aae..e042612c4a 100644
--- a/src/backend/storage/buffer/freelist.c
+++ b/src/backend/storage/buffer/freelist.c
@@ -15,6 +15,7 @@
  */
 #include "postgres.h"
 
+#include "pgstat.h"
 #include "port/atomics.h"
 #include "storage/buf_internals.h"
 #include "storage/bufmgr.h"
@@ -198,7 +199,7 @@ have_free_buffer(void)
  *	return the buffer with the buffer header spinlock still held.
  */
 BufferDesc *
-StrategyGetBuffer(BufferAccessStrategy strategy, uint32 *buf_state)
+StrategyGetBuffer(BufferAccessStrategy strategy, uint32 *buf_state, bool *from_ring)
 {
 	BufferDesc *buf;
 	int			bgwprocno;
@@ -212,8 +213,19 @@ StrategyGetBuffer(BufferAccessStrategy strategy, uint32 *buf_state)
 	if (strategy != NULL)
 	{
 		buf = GetBufferFromRing(strategy, buf_state);
-		if (buf != NULL)
+		*from_ring = buf != NULL;
+		if (*from_ring)
+		{
+			/*
+			 * When a strategy is in use, reused buffers from the strategy ring
+			 * will be counted as allocations for the purposes of IO Operation
+			 * statistics tracking. However, even when a strategy is in use, if
+			 * a new buffer must be allocated from shared buffers and added to
+			 * the ring, this is counted as a IOPATH_SHARED allocation.
+			 */
+			pgstat_count_io_op(IOOP_ALLOC, IOPATH_STRATEGY);
 			return buf;
+		}
 	}
 
 	/*
@@ -247,6 +259,7 @@ StrategyGetBuffer(BufferAccessStrategy strategy, uint32 *buf_state)
 	 * the rate of buffer consumption.  Note that buffers recycled by a
 	 * strategy object are intentionally not counted here.
 	 */
+	pgstat_count_io_op(IOOP_ALLOC, IOPATH_SHARED);
 	pg_atomic_fetch_add_u32(&StrategyControl->numBufferAllocs, 1);
 
 	/*
@@ -682,8 +695,15 @@ AddBufferToRing(BufferAccessStrategy strategy, BufferDesc *buf)
  * if this buffer should be written and re-used.
  */
 bool
-StrategyRejectBuffer(BufferAccessStrategy strategy, BufferDesc *buf)
+StrategyRejectBuffer(BufferAccessStrategy strategy, BufferDesc *buf, bool *from_ring)
 {
+
+	/*
+	 * Start by assuming that we will use the dirty buffer selected by
+	 * StrategyGetBuffer().
+	 */
+	*from_ring = true;
+
 	/* We only do this in bulkread mode */
 	if (strategy->btype != BAS_BULKREAD)
 		return false;
@@ -699,5 +719,12 @@ StrategyRejectBuffer(BufferAccessStrategy strategy, BufferDesc *buf)
 	 */
 	strategy->buffers[strategy->current] = InvalidBuffer;
 
+	/*
+	 * Since we will not be writing out a dirty buffer from the ring, set
+	 * from_ring to false so that the caller does not count this write as a
+	 * "strategy write" and can do proper bookkeeping.
+	 */
+	*from_ring = false;
+
 	return true;
 }
diff --git a/src/backend/storage/buffer/localbuf.c b/src/backend/storage/buffer/localbuf.c
index 41a08076b3..e99e1f53ef 100644
--- a/src/backend/storage/buffer/localbuf.c
+++ b/src/backend/storage/buffer/localbuf.c
@@ -15,6 +15,7 @@
  */
 #include "postgres.h"
 
+#include "pgstat.h"
 #include "access/parallel.h"
 #include "catalog/catalog.h"
 #include "executor/instrument.h"
@@ -123,6 +124,8 @@ LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
 	if (LocalBufHash == NULL)
 		InitLocalBuffers();
 
+	pgstat_count_io_op(IOOP_ALLOC, IOPATH_LOCAL);
+
 	/* See if the desired buffer already exists */
 	hresult = (LocalBufferLookupEnt *)
 		hash_search(LocalBufHash, (void *) &newTag, HASH_FIND, NULL);
@@ -226,6 +229,8 @@ LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
 				  localpage,
 				  false);
 
+		pgstat_count_io_op(IOOP_WRITE, IOPATH_LOCAL);
+
 		/* Mark not-dirty now in case we error out below */
 		buf_state &= ~BM_DIRTY;
 		pg_atomic_unlocked_write_u32(&bufHdr->state, buf_state);
diff --git a/src/backend/storage/sync/sync.c b/src/backend/storage/sync/sync.c
index e1fb631003..20e259edef 100644
--- a/src/backend/storage/sync/sync.c
+++ b/src/backend/storage/sync/sync.c
@@ -432,6 +432,8 @@ ProcessSyncRequests(void)
 					total_elapsed += elapsed;
 					processed++;
 
+					pgstat_count_io_op(IOOP_FSYNC, IOPATH_SHARED);
+
 					if (log_checkpoints)
 						elog(DEBUG1, "checkpoint sync: number=%d file=%s time=%.3f ms",
 							 processed,
diff --git a/src/backend/utils/activity/Makefile b/src/backend/utils/activity/Makefile
index a2e8507fd6..0098785089 100644
--- a/src/backend/utils/activity/Makefile
+++ b/src/backend/utils/activity/Makefile
@@ -22,6 +22,7 @@ OBJS = \
 	pgstat_checkpointer.o \
 	pgstat_database.o \
 	pgstat_function.o \
+	pgstat_io_ops.o \
 	pgstat_relation.o \
 	pgstat_replslot.o \
 	pgstat_shmem.o \
diff --git a/src/backend/utils/activity/pgstat.c b/src/backend/utils/activity/pgstat.c
index 88e5dd1b2b..52924e64dd 100644
--- a/src/backend/utils/activity/pgstat.c
+++ b/src/backend/utils/activity/pgstat.c
@@ -359,6 +359,15 @@ static const PgStat_KindInfo pgstat_kind_infos[PGSTAT_NUM_KINDS] = {
 		.snapshot_cb = pgstat_checkpointer_snapshot_cb,
 	},
 
+	[PGSTAT_KIND_IOOPS] = {
+		.name = "io_ops",
+
+		.fixed_amount = true,
+
+		.reset_all_cb = pgstat_io_ops_reset_all_cb,
+		.snapshot_cb = pgstat_io_ops_snapshot_cb,
+	},
+
 	[PGSTAT_KIND_SLRU] = {
 		.name = "slru",
 
@@ -628,6 +637,9 @@ pgstat_report_stat(bool force)
 	/* flush database / relation / function / ... stats */
 	partial_flush |= pgstat_flush_pending_entries(nowait);
 
+	/* flush IO Operations stats */
+	partial_flush |= pgstat_flush_io_ops(nowait);
+
 	/* flush wal stats */
 	partial_flush |= pgstat_flush_wal(nowait);
 
@@ -1312,6 +1324,12 @@ pgstat_write_statsfile(void)
 	pgstat_build_snapshot_fixed(PGSTAT_KIND_CHECKPOINTER);
 	write_chunk_s(fpout, &pgStatLocal.snapshot.checkpointer);
 
+	/*
+	 * Write IO Operations stats struct
+	 */
+	pgstat_build_snapshot_fixed(PGSTAT_KIND_IOOPS);
+	write_chunk_s(fpout, &pgStatLocal.snapshot.io_path_ops);
+
 	/*
 	 * Write SLRU stats struct
 	 */
@@ -1486,6 +1504,12 @@ pgstat_read_statsfile(void)
 	if (!read_chunk_s(fpin, &shmem->checkpointer.stats))
 		goto error;
 
+	/*
+	 * Read IO Operations stats struct
+	 */
+	if (!read_chunk_s(fpin, &shmem->io_ops.stats))
+		goto error;
+
 	/*
 	 * Read SLRU stats struct
 	 */
diff --git a/src/backend/utils/activity/pgstat_bgwriter.c b/src/backend/utils/activity/pgstat_bgwriter.c
index fbb1edc527..d83df169db 100644
--- a/src/backend/utils/activity/pgstat_bgwriter.c
+++ b/src/backend/utils/activity/pgstat_bgwriter.c
@@ -56,6 +56,11 @@ pgstat_report_bgwriter(void)
 	 * Clear out the statistics buffer, so it can be re-used.
 	 */
 	MemSet(&PendingBgWriterStats, 0, sizeof(PendingBgWriterStats));
+
+	/*
+	 * Also report IO Operations statistics
+	 */
+	pgstat_flush_io_ops(false);
 }
 
 /*
diff --git a/src/backend/utils/activity/pgstat_checkpointer.c b/src/backend/utils/activity/pgstat_checkpointer.c
index af8d513e7b..668abecf90 100644
--- a/src/backend/utils/activity/pgstat_checkpointer.c
+++ b/src/backend/utils/activity/pgstat_checkpointer.c
@@ -62,6 +62,11 @@ pgstat_report_checkpointer(void)
 	 * Clear out the statistics buffer, so it can be re-used.
 	 */
 	MemSet(&PendingCheckpointerStats, 0, sizeof(PendingCheckpointerStats));
+
+	/*
+	 * Also report IO Operation statistics
+	 */
+	pgstat_flush_io_ops(false);
 }
 
 /*
diff --git a/src/backend/utils/activity/pgstat_database.c b/src/backend/utils/activity/pgstat_database.c
index d9275611f0..5fac75c8c6 100644
--- a/src/backend/utils/activity/pgstat_database.c
+++ b/src/backend/utils/activity/pgstat_database.c
@@ -72,6 +72,11 @@ pgstat_report_autovac(Oid dboid)
 	dbentry->stats.last_autovac_time = GetCurrentTimestamp();
 
 	pgstat_unlock_entry(entry_ref);
+
+	/*
+	 * Also report IO Operation statistics
+	 */
+	pgstat_flush_io_ops(false);
 }
 
 /*
diff --git a/src/backend/utils/activity/pgstat_io_ops.c b/src/backend/utils/activity/pgstat_io_ops.c
new file mode 100644
index 0000000000..abe288efc4
--- /dev/null
+++ b/src/backend/utils/activity/pgstat_io_ops.c
@@ -0,0 +1,168 @@
+/* -------------------------------------------------------------------------
+ *
+ * pgstat_io_ops.c
+ *	  Implementation of IO operation statistics.
+ *
+ * This file contains the implementation of IO operation statistics. It is kept
+ * separate from pgstat.c to enforce the line between the statistics access /
+ * storage implementation and the details about individual types of
+ * statistics.
+ *
+ * Copyright (c) 2001-2022, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  src/backend/utils/activity/pgstat_io_ops.c
+ * -------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "utils/pgstat_internal.h"
+
+static PgStat_IOPathOps pending_IOOpStats;
+static bool have_ioopstats = false;
+
+
+/*
+ * Flush out locally pending IO Operation statistics entries
+ *
+ * If nowait is true and the lock could not be acquired immediately, this
+ * function returns true and the pending entries are left unflushed.
+ * Otherwise, the pending entries are accumulated into shared memory and
+ * this function returns false.
+ */
+bool
+pgstat_flush_io_ops(bool nowait)
+{
+	PgStat_IOPathOps *stats_shmem;
+
+	if (!have_ioopstats)
+		return false;
+
+	stats_shmem =
+		&pgStatLocal.shmem->io_ops.stats[backend_type_get_idx(MyBackendType)];
+
+	if (!nowait)
+		LWLockAcquire(&stats_shmem->lock, LW_EXCLUSIVE);
+	else if (!LWLockConditionalAcquire(&stats_shmem->lock, LW_EXCLUSIVE))
+		return true;
+
+	for (int i = 0; i < IOPATH_NUM_TYPES; i++)
+	{
+		PgStat_IOOpCounters *sharedent = &stats_shmem->data[i];
+		PgStat_IOOpCounters *pendingent = &pending_IOOpStats.data[i];
+
+#define IO_OP_ACC(fld) sharedent->fld += pendingent->fld
+		IO_OP_ACC(allocs);
+		IO_OP_ACC(extends);
+		IO_OP_ACC(fsyncs);
+		IO_OP_ACC(writes);
+#undef IO_OP_ACC
+	}
+
+	LWLockRelease(&stats_shmem->lock);
+
+	MemSet(&pending_IOOpStats, 0, sizeof(pending_IOOpStats));
+
+	have_ioopstats = false;
+
+	return false;
+}
+
+void
+pgstat_io_ops_snapshot_cb(void)
+{
+	PgStatShared_BackendIOPathOps *stats_shmem = &pgStatLocal.shmem->io_ops;
+
+	LWLockAcquire(&stats_shmem->lock, LW_SHARED);
+
+	memcpy(pgStatLocal.snapshot.io_path_ops, &stats_shmem->stats,
+			sizeof(stats_shmem->stats));
+
+	LWLockRelease(&stats_shmem->lock);
+}
+
+void
+pgstat_io_ops_reset_all_cb(TimestampTz ts)
+{
+	PgStatShared_BackendIOPathOps *stats_shmem = &pgStatLocal.shmem->io_ops;
+
+	LWLockAcquire(&stats_shmem->lock, LW_EXCLUSIVE);
+
+	memset(&stats_shmem->stats, 0, sizeof(stats_shmem->stats));
+
+	for (int i = 0; i < BACKEND_NUM_TYPES; i++)
+		stats_shmem->stats[i].stat_reset_timestamp = ts;
+
+	LWLockRelease(&stats_shmem->lock);
+}
+
+void
+pgstat_count_io_op(IOOp io_op, IOPath io_path)
+{
+	PgStat_IOOpCounters *pending_counters = &pending_IOOpStats.data[io_path];
+
+	switch (io_op)
+	{
+		case IOOP_ALLOC:
+			pending_counters->allocs++;
+			break;
+		case IOOP_EXTEND:
+			pending_counters->extends++;
+			break;
+		case IOOP_FSYNC:
+			pending_counters->fsyncs++;
+			break;
+		case IOOP_WRITE:
+			pending_counters->writes++;
+			break;
+	}
+
+	have_ioopstats = true;
+}
+
+PgStat_IOPathOps *
+pgstat_fetch_backend_io_path_ops(void)
+{
+	pgstat_snapshot_fixed(PGSTAT_KIND_IOOPS);
+
+	return pgStatLocal.snapshot.io_path_ops;
+}
+
+const char *
+pgstat_io_path_desc(IOPath io_path)
+{
+	switch (io_path)
+	{
+		case IOPATH_DIRECT:
+			return "Direct";
+		case IOPATH_LOCAL:
+			return "Local";
+		case IOPATH_SHARED:
+			return "Shared";
+		case IOPATH_STRATEGY:
+			return "Strategy";
+	}
+
+	elog(ERROR, "unrecognized IOPath value: %d", (int) io_path);
+}
+
+const char *
+pgstat_io_op_desc(IOOp io_op)
+{
+	switch (io_op)
+	{
+		case IOOP_ALLOC:
+			return "Alloc";
+		case IOOP_EXTEND:
+			return "Extend";
+		case IOOP_FSYNC:
+			return "Fsync";
+		case IOOP_WRITE:
+			return "Write";
+	}
+
+	elog(ERROR, "unrecognized IOOp value: %d", (int) io_op);
+}
diff --git a/src/backend/utils/activity/pgstat_relation.c b/src/backend/utils/activity/pgstat_relation.c
index a846d9ffb6..01ea45adf4 100644
--- a/src/backend/utils/activity/pgstat_relation.c
+++ b/src/backend/utils/activity/pgstat_relation.c
@@ -257,6 +257,11 @@ pgstat_report_vacuum(Oid tableoid, bool shared,
 	}
 
 	pgstat_unlock_entry(entry_ref);
+
+	/*
+	 * Also report IO Operations statistics
+	 */
+	pgstat_flush_io_ops(false);
 }
 
 /*
@@ -340,6 +345,11 @@ pgstat_report_analyze(Relation rel,
 	}
 
 	pgstat_unlock_entry(entry_ref);
+
+	/*
+	 * Also report IO Operations statistics
+	 */
+	pgstat_flush_io_ops(false);
 }
 
 /*
diff --git a/src/backend/utils/activity/pgstat_wal.c b/src/backend/utils/activity/pgstat_wal.c
index 5a878bd115..29b2e63e16 100644
--- a/src/backend/utils/activity/pgstat_wal.c
+++ b/src/backend/utils/activity/pgstat_wal.c
@@ -43,6 +43,11 @@ void
 pgstat_report_wal(bool force)
 {
 	pgstat_flush_wal(force);
+
+	/*
+	 * Also report IO Operation statistics
+	 */
+	pgstat_flush_io_ops(false);
 }
 
 /*
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index 893690dad5..6259cc4f4c 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -2104,6 +2104,8 @@ pg_stat_reset_shared(PG_FUNCTION_ARGS)
 		pgstat_reset_of_kind(PGSTAT_KIND_BGWRITER);
 		pgstat_reset_of_kind(PGSTAT_KIND_CHECKPOINTER);
 	}
+	else if (strcmp(target, "io") == 0)
+		pgstat_reset_of_kind(PGSTAT_KIND_IOOPS);
 	else if (strcmp(target, "recovery_prefetch") == 0)
 		XLogPrefetchResetStats();
 	else if (strcmp(target, "wal") == 0)
@@ -2112,7 +2114,7 @@ pg_stat_reset_shared(PG_FUNCTION_ARGS)
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
 				 errmsg("unrecognized reset target: \"%s\"", target),
-				 errhint("Target must be \"archiver\", \"bgwriter\", \"recovery_prefetch\", or \"wal\".")));
+				 errhint("Target must be \"archiver\", \"bgwriter\", \"io\", \"recovery_prefetch\", or \"wal\".")));
 
 	PG_RETURN_VOID();
 }
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 5276bf25a1..61e95135f2 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -331,6 +331,8 @@ typedef enum BackendType
 	B_WAL_WRITER,
 } BackendType;
 
+/* Number of valid BackendTypes (excluding B_INVALID, which is 0) */
+#define BACKEND_NUM_TYPES B_WAL_WRITER
+
 extern PGDLLIMPORT BackendType MyBackendType;
 
 extern const char *GetBackendTypeDesc(BackendType backendType);
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index ac28f813b4..36a4b89a58 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -14,6 +14,7 @@
 #include "datatype/timestamp.h"
 #include "portability/instr_time.h"
 #include "postmaster/pgarch.h"	/* for MAX_XFN_CHARS */
+#include "storage/lwlock.h"
 #include "utils/backend_progress.h" /* for backward compatibility */
 #include "utils/backend_status.h"	/* for backward compatibility */
 #include "utils/relcache.h"
@@ -48,6 +49,7 @@ typedef enum PgStat_Kind
 	PGSTAT_KIND_ARCHIVER,
 	PGSTAT_KIND_BGWRITER,
 	PGSTAT_KIND_CHECKPOINTER,
+	PGSTAT_KIND_IOOPS,
 	PGSTAT_KIND_SLRU,
 	PGSTAT_KIND_WAL,
 } PgStat_Kind;
@@ -276,6 +278,45 @@ typedef struct PgStat_CheckpointerStats
 	PgStat_Counter buf_fsync_backend;
 } PgStat_CheckpointerStats;
 
+/*
+ * Types related to counting IO Operations for various IO Paths
+ */
+
+typedef enum IOOp
+{
+	IOOP_ALLOC,
+	IOOP_EXTEND,
+	IOOP_FSYNC,
+	IOOP_WRITE,
+} IOOp;
+
+#define IOOP_NUM_TYPES (IOOP_WRITE + 1)
+
+typedef enum IOPath
+{
+	IOPATH_DIRECT,
+	IOPATH_LOCAL,
+	IOPATH_SHARED,
+	IOPATH_STRATEGY,
+} IOPath;
+
+#define IOPATH_NUM_TYPES (IOPATH_STRATEGY + 1)
+
+typedef struct PgStat_IOOpCounters
+{
+	PgStat_Counter allocs;
+	PgStat_Counter extends;
+	PgStat_Counter fsyncs;
+	PgStat_Counter writes;
+} PgStat_IOOpCounters;
+
+typedef struct PgStat_IOPathOps
+{
+	LWLock		lock;
+	PgStat_IOOpCounters data[IOPATH_NUM_TYPES];
+	TimestampTz stat_reset_timestamp;
+} PgStat_IOPathOps;
+
 typedef struct PgStat_StatDBEntry
 {
 	PgStat_Counter n_xact_commit;
@@ -453,6 +494,18 @@ extern void pgstat_report_checkpointer(void);
 extern PgStat_CheckpointerStats *pgstat_fetch_stat_checkpointer(void);
 
 
+/*
+ * Functions in pgstat_io_ops.c
+ */
+
+extern void pgstat_count_io_op(IOOp io_op, IOPath io_path);
+extern bool pgstat_flush_io_ops(bool nowait);
+extern PgStat_IOPathOps *pgstat_fetch_backend_io_path_ops(void);
+extern PgStat_Counter pgstat_fetch_cumulative_io_ops(IOPath io_path, IOOp io_op);
+extern const char *pgstat_io_op_desc(IOOp io_op);
+extern const char *pgstat_io_path_desc(IOPath io_path);
+
+
 /*
  * Functions in pgstat_database.c
  */
diff --git a/src/include/storage/buf_internals.h b/src/include/storage/buf_internals.h
index aded5e8f7e..e35d82f050 100644
--- a/src/include/storage/buf_internals.h
+++ b/src/include/storage/buf_internals.h
@@ -310,10 +310,10 @@ extern void ScheduleBufferTagForWriteback(WritebackContext *context, BufferTag *
 
 /* freelist.c */
 extern BufferDesc *StrategyGetBuffer(BufferAccessStrategy strategy,
-									 uint32 *buf_state);
+									 uint32 *buf_state, bool *from_ring);
 extern void StrategyFreeBuffer(BufferDesc *buf);
 extern bool StrategyRejectBuffer(BufferAccessStrategy strategy,
-								 BufferDesc *buf);
+								 BufferDesc *buf, bool *from_ring);
 
 extern int	StrategySyncStart(uint32 *complete_passes, uint32 *num_buf_alloc);
 extern void StrategyNotifyBgWriter(int bgwprocno);
diff --git a/src/include/utils/backend_status.h b/src/include/utils/backend_status.h
index 7403bca25e..49d062b1af 100644
--- a/src/include/utils/backend_status.h
+++ b/src/include/utils/backend_status.h
@@ -306,6 +306,40 @@ extern const char *pgstat_get_crashed_backend_activity(int pid, char *buffer,
 													   int buflen);
 extern uint64 pgstat_get_my_query_id(void);
 
+/* Utility functions */
+
+/*
+ * When maintaining an array of information about all valid BackendTypes, in
+ * order to avoid wasting the 0th spot, use this helper to convert a valid
+ * BackendType to a valid location in the array (given that no spot is
+ * maintained for B_INVALID BackendType).
+ */
+static inline int
+backend_type_get_idx(BackendType backend_type)
+{
+	/*
+	 * backend_type must be one of the valid backend types. If caller is
+	 * maintaining backend information in an array that includes B_INVALID,
+	 * this function is unnecessary.
+	 */
+	Assert(backend_type > B_INVALID && backend_type <= BACKEND_NUM_TYPES);
+	return backend_type - 1;
+}
+
+/*
+ * When using a value from an array of information about all valid
+ * BackendTypes, add 1 to the index before using it as a BackendType to adjust
+ * for not maintaining a spot for B_INVALID BackendType.
+ */
+static inline BackendType
+idx_get_backend_type(int idx)
+{
+	int			backend_type = idx + 1;
+
+	/*
+	 * If the array includes a spot for B_INVALID BackendType this function is
+	 * not required.
+	 */
+	Assert(backend_type > B_INVALID && backend_type <= BACKEND_NUM_TYPES);
+	return backend_type;
+}
 
 /* ----------
  * Support functions for the SQL-callable functions to
diff --git a/src/include/utils/pgstat_internal.h b/src/include/utils/pgstat_internal.h
index 9303d05427..adffdd147d 100644
--- a/src/include/utils/pgstat_internal.h
+++ b/src/include/utils/pgstat_internal.h
@@ -329,6 +329,12 @@ typedef struct PgStatShared_Checkpointer
 	PgStat_CheckpointerStats reset_offset;
 } PgStatShared_Checkpointer;
 
+typedef struct PgStatShared_BackendIOPathOps
+{
+	LWLock lock;
+	PgStat_IOPathOps stats[BACKEND_NUM_TYPES];
+} PgStatShared_BackendIOPathOps;
+
 typedef struct PgStatShared_SLRU
 {
 	/* lock protects ->stats */
@@ -419,6 +425,7 @@ typedef struct PgStat_ShmemControl
 	PgStatShared_Archiver archiver;
 	PgStatShared_BgWriter bgwriter;
 	PgStatShared_Checkpointer checkpointer;
+	PgStatShared_BackendIOPathOps io_ops;
 	PgStatShared_SLRU slru;
 	PgStatShared_Wal wal;
 } PgStat_ShmemControl;
@@ -442,6 +449,8 @@ typedef struct PgStat_Snapshot
 
 	PgStat_CheckpointerStats checkpointer;
 
+	PgStat_IOPathOps io_path_ops[BACKEND_NUM_TYPES];
+
 	PgStat_SLRUStats slru[SLRU_NUM_ELEMENTS];
 
 	PgStat_WalStats wal;
@@ -549,6 +558,14 @@ extern void pgstat_database_reset_timestamp_cb(PgStatShared_Common *header, Time
 extern bool pgstat_function_flush_cb(PgStat_EntryRef *entry_ref, bool nowait);
 
 
+/*
+ * Functions in pgstat_io_ops.c
+ */
+
+extern void pgstat_io_ops_snapshot_cb(void);
+extern void pgstat_io_ops_reset_all_cb(TimestampTz ts);
+
+
 /*
  * Functions in pgstat_relation.c
  */
-- 
2.34.1

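For reviewers trying out the pg_stat_io view added by the patch below, here is a quick smoke-test query (just a sketch; the column values will of course vary by workload):

```sql
-- Aggregate IO operations per IO path across all backend types
SELECT io_path,
       sum(alloc)  AS allocs,
       sum(extend) AS extends,
       sum(write)  AS writes,
       sum(fsync)  AS fsyncs
  FROM pg_stat_io
 GROUP BY io_path
 ORDER BY io_path;
```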
From 714fab745590b4ed6c1b9e220fb75c36ad5ab85d Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplage...@gmail.com>
Date: Mon, 4 Jul 2022 15:44:17 -0400
Subject: [PATCH v23 3/3] Add system view tracking IO ops per backend type

Add pg_stat_io, a system view which tracks the number of IOOps (allocs,
writes, fsyncs, and extends) done through each IOPath (e.g. shared
buffers, local buffers, unbuffered IO) by each type of backend.

Some of these counters will always be zero. For example, checkpointer
does not currently use a BufferAccessStrategy, so the "Strategy" IOPath
counters for checkpointer will be 0 for all IOOps. All combinations of
IOPath and IOOp are enumerated in the view but not all are populated or
even possible at this point.

View stats are fetched from statistics incremented when a backend
performs an IO Operation and maintained by the cumulative statistics
subsystem.

Each row of the view is stats for a particular BackendType for a
particular IOPath (e.g. shared buffer accesses by checkpointer) and
each column in the view is the total number of IO Operations done (e.g.
writes).
So a cell in the view would be, for example, the number of shared
buffers written by checkpointer since the last stats reset.

Note that some of the cells in the view are redundant with fields in
pg_stat_bgwriter (e.g. buffers_backend), however these have been kept in
pg_stat_bgwriter for backwards compatibility. Deriving the redundant
pg_stat_bgwriter stats from the IO operations stats structures was also
problematic due to the separate reset targets for 'bgwriter' and
'io'.

Suggested by Andres Freund

Author: Melanie Plageman <melanieplage...@gmail.com>
Reviewed-by: Justin Pryzby <pry...@telsasoft.com>
Discussion: https://www.postgresql.org/message-id/flat/20200124195226.lth52iydq2n2uilq%40alap3.anarazel.de
---
 doc/src/sgml/monitoring.sgml         | 108 ++++++++++++++++++++++++++-
 src/backend/catalog/system_views.sql |  11 +++
 src/backend/utils/adt/pgstatfuncs.c  |  66 ++++++++++++++++
 src/include/catalog/pg_proc.dat      |   9 +++
 src/test/regress/expected/rules.out  |   8 ++
 src/test/regress/expected/stats.out  |  59 +++++++++++++++
 src/test/regress/sql/stats.sql       |  34 +++++++++
 7 files changed, 294 insertions(+), 1 deletion(-)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 4549c2560e..775ecf2f21 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -448,6 +448,15 @@ postgres   27093  0.0  0.0  30096  2752 ?        Ss   11:34   0:00 postgres: ser
      </entry>
      </row>
 
+     <row>
+      <entry><structname>pg_stat_io</structname><indexterm><primary>pg_stat_io</primary></indexterm></entry>
+      <entry>A row for each IO path for each backend type showing
+      statistics about backend IO operations. See
+       <link linkend="monitoring-pg-stat-io-view">
+       <structname>pg_stat_io</structname></link> for details.
+     </entry>
+     </row>
+
      <row>
       <entry><structname>pg_stat_wal</structname><indexterm><primary>pg_stat_wal</primary></indexterm></entry>
       <entry>One row only, showing statistics about WAL activity. See
@@ -3595,7 +3604,102 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
        <structfield>stats_reset</structfield> <type>timestamp with time zone</type>
       </para>
       <para>
-       Time at which these statistics were last reset
+       Time at which these statistics were last reset.
+      </para></entry>
+     </row>
+    </tbody>
+   </tgroup>
+  </table>
+
+ </sect2>
+
+ <sect2 id="monitoring-pg-stat-io-view">
+  <title><structname>pg_stat_io</structname></title>
+
+  <indexterm>
+   <primary>pg_stat_io</primary>
+  </indexterm>
+
+  <para>
+   The <structname>pg_stat_io</structname> view has one row for each
+   combination of backend type and IO path, showing cluster-wide
+   statistics on the IO operations done by that backend type through
+   that IO path.
+  </para>
+
+  <table id="pg-stat-io-view" xreflabel="pg_stat_io">
+   <title><structname>pg_stat_io</structname> View</title>
+   <tgroup cols="1">
+    <thead>
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       Column Type
+      </para>
+      <para>
+       Description
+      </para></entry>
+     </row>
+    </thead>
+    <tbody>
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>backend_type</structfield> <type>text</type>
+      </para>
+      <para>
+       Type of backend (e.g. background worker, autovacuum worker).
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>io_path</structfield> <type>text</type>
+      </para>
+      <para>
+       IO path taken (e.g. shared buffers, direct).
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>alloc</structfield> <type>bigint</type>
+      </para>
+      <para>
+       Number of buffers allocated.
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>extend</structfield> <type>bigint</type>
+      </para>
+      <para>
+       Number of blocks extended.
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>fsync</structfield> <type>bigint</type>
+      </para>
+      <para>
+       Number of fsync calls.
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>write</structfield> <type>bigint</type>
+      </para>
+      <para>
+       Number of blocks written.
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>stats_reset</structfield> <type>timestamp with time zone</type>
+      </para>
+      <para>
+       Time at which these statistics were last reset.
       </para></entry>
      </row>
     </tbody>
@@ -5355,6 +5459,8 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
         the <structname>pg_stat_bgwriter</structname>
         view, <literal>archiver</literal> to reset all the counters shown in
         the <structname>pg_stat_archiver</structname> view,
+        <literal>io</literal> to reset all the counters shown in the
+        <structname>pg_stat_io</structname> view,
         <literal>wal</literal> to reset all the counters shown in the
         <structname>pg_stat_wal</structname> view or
         <literal>recovery_prefetch</literal> to reset all the counters shown
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index fedaed533b..b0b2d39e28 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1115,6 +1115,17 @@ CREATE VIEW pg_stat_bgwriter AS
         pg_stat_get_buf_alloc() AS buffers_alloc,
         pg_stat_get_bgwriter_stat_reset_time() AS stats_reset;
 
+CREATE VIEW pg_stat_io AS
+SELECT
+       b.backend_type,
+       b.io_path,
+       b.alloc,
+       b.extend,
+       b.fsync,
+       b.write,
+       b.stats_reset
+FROM pg_stat_get_io() b;
+
 CREATE VIEW pg_stat_wal AS
     SELECT
         w.wal_records,
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index 6259cc4f4c..30aff64860 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -1739,6 +1739,72 @@ pg_stat_get_buf_alloc(PG_FUNCTION_ARGS)
 	PG_RETURN_INT64(pgstat_fetch_stat_bgwriter()->buf_alloc);
 }
 
+/*
+ * When adding a new column to the pg_stat_io view, add a new enum value
+ * here above IO_NUM_COLUMNS.
+ */
+enum
+{
+	IO_COLUMN_BACKEND_TYPE,
+	IO_COLUMN_IO_PATH,
+	IO_COLUMN_ALLOCS,
+	IO_COLUMN_EXTENDS,
+	IO_COLUMN_FSYNCS,
+	IO_COLUMN_WRITES,
+	IO_COLUMN_RESET_TIME,
+	IO_NUM_COLUMNS,
+};
+
+Datum
+pg_stat_get_io(PG_FUNCTION_ARGS)
+{
+	PgStat_IOPathOps *io_path_ops;
+	ReturnSetInfo *rsinfo;
+	Datum reset_time;
+
+	SetSingleFuncCall(fcinfo, 0);
+	io_path_ops = pgstat_fetch_backend_io_path_ops();
+	rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
+
+	/*
+	 * Currently it is not permitted to reset IO operation stats for
+	 * individual IO Paths or individual BackendTypes. All IO Operation
+	 * statistics are reset together. As such, it is easiest to reuse the
+	 * first reset timestamp available.
+	 */
+	reset_time = TimestampTzGetDatum(io_path_ops->stat_reset_timestamp);
+
+	for (int i = 0; i < BACKEND_NUM_TYPES; i++)
+	{
+		PgStat_IOOpCounters *counters = io_path_ops->data;
+		Datum		backend_type_desc =
+			CStringGetTextDatum(GetBackendTypeDesc(idx_get_backend_type(i)));
+
+		for (int j = 0; j < IOPATH_NUM_TYPES; j++)
+		{
+			Datum values[IO_NUM_COLUMNS];
+			bool nulls[IO_NUM_COLUMNS];
+			memset(values, 0, sizeof(values));
+			memset(nulls, 0, sizeof(nulls));
+
+			values[IO_COLUMN_BACKEND_TYPE] = backend_type_desc;
+			values[IO_COLUMN_IO_PATH] = CStringGetTextDatum(pgstat_io_path_desc(j));
+			values[IO_COLUMN_RESET_TIME] = reset_time;
+			values[IO_COLUMN_ALLOCS] = Int64GetDatum(counters->allocs);
+			values[IO_COLUMN_EXTENDS] = Int64GetDatum(counters->extends);
+			values[IO_COLUMN_FSYNCS] = Int64GetDatum(counters->fsyncs);
+			values[IO_COLUMN_WRITES] = Int64GetDatum(counters->writes);
+
+			tuplestore_putvalues(rsinfo->setResult, rsinfo->setDesc, values, nulls);
+			counters++;
+		}
+		io_path_ops++;
+	}
+
+	return (Datum) 0;
+}
+
 /*
  * Returns statistics of WAL activity
  */
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 2e41f4d9e8..e9662fdc04 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -5646,6 +5646,15 @@
   proname => 'pg_stat_get_buf_alloc', provolatile => 's', proparallel => 'r',
   prorettype => 'int8', proargtypes => '', prosrc => 'pg_stat_get_buf_alloc' },
 
+{ oid => '8459', descr => 'statistics: counts of all IO operations done through all IO paths by each type of backend',
+  proname => 'pg_stat_get_io', provolatile => 's', proisstrict => 'f',
+  prorows => '52', proretset => 't',
+  proparallel => 'r', prorettype => 'record', proargtypes => '',
+  proallargtypes => '{text,text,int8,int8,int8,int8,timestamptz}',
+  proargmodes => '{o,o,o,o,o,o,o}',
+  proargnames => '{backend_type,io_path,alloc,extend,fsync,write,stats_reset}',
+  prosrc => 'pg_stat_get_io' },
+
 { oid => '1136', descr => 'statistics: information about WAL activity',
   proname => 'pg_stat_get_wal', proisstrict => 'f', provolatile => 's',
   proparallel => 'r', prorettype => 'record', proargtypes => '',
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 7ec3d2688f..3b05af9ac8 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1873,6 +1873,14 @@ pg_stat_gssapi| SELECT s.pid,
     s.gss_enc AS encrypted
    FROM pg_stat_get_activity(NULL::integer) s(datid, pid, usesysid, application_name, state, query, wait_event_type, wait_event, xact_start, query_start, backend_start, state_change, client_addr, client_hostname, client_port, backend_xid, backend_xmin, backend_type, ssl, sslversion, sslcipher, sslbits, ssl_client_dn, ssl_client_serial, ssl_issuer_dn, gss_auth, gss_princ, gss_enc, leader_pid, query_id)
   WHERE (s.client_port IS NOT NULL);
+pg_stat_io| SELECT b.backend_type,
+    b.io_path,
+    b.alloc,
+    b.extend,
+    b.fsync,
+    b.write,
+    b.stats_reset
+   FROM pg_stat_get_io() b(backend_type, io_path, alloc, extend, fsync, write, stats_reset);
 pg_stat_progress_analyze| SELECT s.pid,
     s.datid,
     d.datname,
diff --git a/src/test/regress/expected/stats.out b/src/test/regress/expected/stats.out
index 5b0ebf090f..6dade03b65 100644
--- a/src/test/regress/expected/stats.out
+++ b/src/test/regress/expected/stats.out
@@ -554,4 +554,63 @@ SELECT pg_stat_get_live_tuples(:drop_stats_test_subxact_oid);
 
 DROP TABLE trunc_stats_test, trunc_stats_test1, trunc_stats_test2, trunc_stats_test3, trunc_stats_test4;
 DROP TABLE prevstats;
+-- Test that writes to Shared Buffers are tracked in pg_stat_io
+SELECT sum(write) AS io_sum_shared_writes_before FROM pg_stat_io WHERE io_path = 'Shared' \gset
+CREATE TABLE test_io_shared_writes(a int);
+INSERT INTO test_io_shared_writes SELECT i FROM generate_series(1,100)i;
+CHECKPOINT;
+SELECT pg_stat_force_next_flush();
+ pg_stat_force_next_flush 
+--------------------------
+ 
+(1 row)
+
+SELECT sum(write) AS io_sum_shared_writes_after FROM pg_stat_io WHERE io_path = 'Shared' \gset
+SELECT :io_sum_shared_writes_after > :io_sum_shared_writes_before;
+ ?column? 
+----------
+ t
+(1 row)
+
+DROP TABLE test_io_shared_writes;
+-- Test that extends of temporary tables are tracked in pg_stat_io
+CREATE TEMPORARY TABLE test_io_local_extends(a int);
+SELECT sum(extend) AS io_sum_local_extends_before FROM pg_stat_io WHERE io_path = 'Local' \gset
+INSERT INTO test_io_local_extends VALUES(1);
+SELECT pg_stat_force_next_flush();
+ pg_stat_force_next_flush 
+--------------------------
+ 
+(1 row)
+
+SELECT sum(extend) AS io_sum_local_extends_after FROM pg_stat_io WHERE io_path = 'Local' \gset
+SELECT :io_sum_local_extends_after > :io_sum_local_extends_before;
+ ?column? 
+----------
+ t
+(1 row)
+
+-- Test that, when using a Strategy, reusing buffers from the Strategy ring
+-- counts as "Strategy" allocs.
+CREATE TABLE test_io_strategy_stats(a INT, b INT);
+ALTER TABLE test_io_strategy_stats SET (autovacuum_enabled = 'false');
+INSERT INTO test_io_strategy_stats SELECT i, i from generate_series(1,8000)i;
+-- Ensure that the next VACUUM will need to perform IO
+VACUUM (FULL) test_io_strategy_stats;
+SELECT sum(alloc) AS io_sum_strategy_allocs_before FROM pg_stat_io WHERE io_path = 'Strategy' \gset
+VACUUM (PARALLEL 0) test_io_strategy_stats;
+SELECT pg_stat_force_next_flush();
+ pg_stat_force_next_flush 
+--------------------------
+ 
+(1 row)
+
+SELECT sum(alloc) AS io_sum_strategy_allocs_after FROM pg_stat_io WHERE io_path = 'Strategy' \gset
+SELECT :io_sum_strategy_allocs_after > :io_sum_strategy_allocs_before;
+ ?column? 
+----------
+ t
+(1 row)
+
+DROP TABLE test_io_strategy_stats;
 -- End of Stats Test
diff --git a/src/test/regress/sql/stats.sql b/src/test/regress/sql/stats.sql
index 3f3cf8fb56..fbd3977605 100644
--- a/src/test/regress/sql/stats.sql
+++ b/src/test/regress/sql/stats.sql
@@ -285,4 +285,38 @@ SELECT pg_stat_get_live_tuples(:drop_stats_test_subxact_oid);
 
 DROP TABLE trunc_stats_test, trunc_stats_test1, trunc_stats_test2, trunc_stats_test3, trunc_stats_test4;
 DROP TABLE prevstats;
+
+-- Test that writes to Shared Buffers are tracked in pg_stat_io
+SELECT sum(write) AS io_sum_shared_writes_before FROM pg_stat_io WHERE io_path = 'Shared' \gset
+CREATE TABLE test_io_shared_writes(a int);
+INSERT INTO test_io_shared_writes SELECT i FROM generate_series(1,100)i;
+CHECKPOINT;
+SELECT pg_stat_force_next_flush();
+SELECT sum(write) AS io_sum_shared_writes_after FROM pg_stat_io WHERE io_path = 'Shared' \gset
+SELECT :io_sum_shared_writes_after > :io_sum_shared_writes_before;
+DROP TABLE test_io_shared_writes;
+
+-- Test that extends of temporary tables are tracked in pg_stat_io
+CREATE TEMPORARY TABLE test_io_local_extends(a int);
+SELECT sum(extend) AS io_sum_local_extends_before FROM pg_stat_io WHERE io_path = 'Local' \gset
+INSERT INTO test_io_local_extends VALUES(1);
+SELECT pg_stat_force_next_flush();
+SELECT sum(extend) AS io_sum_local_extends_after FROM pg_stat_io WHERE io_path = 'Local' \gset
+SELECT :io_sum_local_extends_after > :io_sum_local_extends_before;
+
+-- Test that, when using a Strategy, reusing buffers from the Strategy ring
+-- counts as "Strategy" allocs.
+CREATE TABLE test_io_strategy_stats(a INT, b INT);
+ALTER TABLE test_io_strategy_stats SET (autovacuum_enabled = 'false');
+INSERT INTO test_io_strategy_stats SELECT i, i FROM generate_series(1,8000)i;
+-- Ensure that the next VACUUM will need to perform IO
+VACUUM (FULL) test_io_strategy_stats;
+SELECT sum(alloc) AS io_sum_strategy_allocs_before FROM pg_stat_io WHERE io_path = 'Strategy' \gset
+VACUUM (PARALLEL 0) test_io_strategy_stats;
+SELECT pg_stat_force_next_flush();
+SELECT sum(alloc) AS io_sum_strategy_allocs_after FROM pg_stat_io WHERE io_path = 'Strategy' \gset
+SELECT :io_sum_strategy_allocs_after > :io_sum_strategy_allocs_before;
+DROP TABLE test_io_strategy_stats;
+
+
 -- End of Stats Test
-- 
2.34.1