Hi,

I think we need to add some statistics to pg_stat_wal view.

Although there are some parameter related WAL,
there are few statistics for tuning them.

I think it's better to provide the following statistics.
Please let me know your comments.

```
postgres=# SELECT * from pg_stat_wal;
-[ RECORD 1 ]-------+------------------------------
wal_records         | 2000224
wal_fpi             | 47
wal_bytes           | 248216337
wal_buffers_full    | 20954
wal_init_file       | 8
wal_write_backend   | 20960
wal_write_walwriter | 46
wal_write_time      | 51
wal_sync_backend    | 7
wal_sync_walwriter  | 8
wal_sync_time       | 0
stats_reset         | 2020-10-20 11:04:51.307771+09
```

1. Basic statistics of WAL activity

- wal_records: Total number of WAL records generated
- wal_fpi: Total number of WAL full page images generated
- wal_bytes: Total amount of WAL bytes generated

To understand DB's performance, first, we will check the performance
trends for the entire database instance.
For example, if the number of wal_fpi becomes higher, users may tune
"wal_compression", "checkpoint_timeout" and so on.

Although users can check the above statistics via EXPLAIN, auto_explain,
autovacuum and pg_stat_statements now,
if users want to see the performance trends  for the entire database,
they must recalculate the statistics.

I think it is useful to add the sum of the basic statistics.


2.  WAL segment file creation

- wal_init_file: Total number of WAL segment files created.

To create a new WAL file may have an impact on the performance of
a write-heavy workload generating lots of WAL. If this number is reported high, to reduce the number of this initialization, we can tune WAL-related parameters
so that more "recycled" WAL files can be held.



3. Number of when WAL is flushed

- wal_write_backend : Total number of WAL data written to the disk by backends - wal_write_walwriter : Total number of WAL data written to the disk by walwriter - wal_sync_backend : Total number of WAL data synced to the disk by backends - wal_sync_walwriter : Total number of WAL data synced to the disk by walwrite

I think it's useful for tuning "synchronous_commit" and "commit_delay" for query executions. If the number of WAL is flushed is high, users can know "synchronous_commit" is useful for the workload.

Also, it's useful for tuning "wal_writer_delay" and "wal_writer_flush_after" for wal writer.
If the number is high, users can change the parameter for performance.


4.  Wait time when WAL is flushed

- wal_write_time : Total amount of time that has been spent in the portion of WAL data was written to disk by backend and walwriter, in milliseconds (if track-io-timing is enabled, otherwise zero.) - wal_sync_time : Total amount of time that has been spent in the portion of WAL data was synced to disk by backend and walwriter, in milliseconds (if track-io-timing is enabled, otherwise zero.)

If the time becomes much higher, users can detect the possibility of disk failure. Since users can see how much flush time occupies of the query execution time,
it may lead to query tuning and so on.


Best Regards,
--
Masahiro Ikeda
NTT DATA CORPORATION
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 66566765..075156f8 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -3406,12 +3406,106 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
     </thead>
 
     <tbody>
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>wal_records</structfield> <type>bigint</type>
+      </para>
+      <para>
+       Total number of WAL records generated
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>wal_fpi</structfield> <type>bigint</type>
+      </para>
+      <para>
+       Total number of WAL full page images generated
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>wal_bytes</structfield> <type>bigint</type>
+      </para>
+      <para>
+       Total amount of WAL bytes generated
+      </para></entry>
+     </row>
+
      <row>
       <entry role="catalog_table_entry"><para role="column_definition">
        <structfield>wal_buffers_full</structfield> <type>bigint</type>
       </para>
       <para>
-       Number of times WAL data was written to the disk because WAL buffers got full
+       Total number of WAL data written to the disk because WAL buffers got full
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>wal_init_file</structfield> <type>bigint</type>
+      </para>
+      <para>
+       Total number of WAL file segment created
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>wal_write_backend</structfield> <type>bigint</type>
+      </para>
+      <para>
+       Total number of WAL data written to the disk by backends
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>wal_write_walwriter</structfield> <type>bigint</type>
+      </para>
+      <para>
+       Total number of WAL data written to the disk by walwriter
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>wal_write_time</structfield> <type>bigint</type>
+      </para>
+      <para>
+       Total amount of time that has been spent in the portion of
+       WAL data was written to disk by backend and walwriter, in milliseconds
+       (if <xref linkend="guc-track-io-timing"/> is enabled, otherwise zero)
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>wal_sync_backend</structfield> <type>bigint</type>
+      </para>
+      <para>
+       Total number of WAL data synced to the disk by backends
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>wal_sync_walwriter</structfield> <type>bigint</type>
+      </para>
+      <para>
+       Total number of WAL data synced to the disk by walwriter
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>wal_sync_time</structfield> <type>bigint</type>
+      </para>
+      <para>
+       Total amount of time that has been spent in the portion of
+       WAL data was synced to disk by backend and walwriter, in milliseconds
+       (if <xref linkend="guc-track-io-timing"/> is enabled, otherwise zero)
       </para></entry>
      </row>
 
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 52a67b11..4a46b322 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -1262,6 +1262,10 @@ XLogInsertRecord(XLogRecData *rdata,
 		pgWalUsage.wal_bytes += rechdr->xl_tot_len;
 		pgWalUsage.wal_records++;
 		pgWalUsage.wal_fpi += num_fpi;
+
+		WalStats.m_wal_bytes += rechdr->xl_tot_len;
+		WalStats.m_wal_records++;
+		WalStats.m_wal_fpi += num_fpi;
 	}
 
 	return EndPos;
@@ -2527,6 +2531,8 @@ XLogWrite(XLogwrtRqst WriteRqst, bool flexible)
 			Size		nbytes;
 			Size		nleft;
 			int			written;
+			instr_time	start;
+			instr_time	duration;
 
 			/* OK to write the page(s) */
 			from = XLogCtl->pages + startidx * (Size) XLOG_BLCKSZ;
@@ -2535,9 +2541,28 @@ XLogWrite(XLogwrtRqst WriteRqst, bool flexible)
 			do
 			{
 				errno = 0;
+				if (track_io_timing)
+					INSTR_TIME_SET_CURRENT(start);
+
 				pgstat_report_wait_start(WAIT_EVENT_WAL_WRITE);
 				written = pg_pwrite(openLogFile, from, nleft, startoffset);
 				pgstat_report_wait_end();
+
+				if (track_io_timing)
+				{
+					INSTR_TIME_SET_CURRENT(duration);
+					INSTR_TIME_SUBTRACT(duration, start);
+					WalStats.m_wal_write_time += INSTR_TIME_GET_MILLISEC(duration);
+				}
+
+				if (AmWalWriterProcess()){
+					WalStats.m_wal_write_walwriter++;
+				}
+				else
+				{
+					WalStats.m_wal_write_backend++;
+				}
+
 				if (written <= 0)
 				{
 					char		xlogfname[MAXFNAMELEN];
@@ -3418,6 +3443,7 @@ XLogFileInit(XLogSegNo logsegno, bool *use_existent, bool use_lock)
 				 errmsg("could not open file \"%s\": %m", path)));
 
 	elog(DEBUG2, "done creating and filling new WAL file");
+	WalStats.m_wal_init_file++;
 
 	return fd;
 }
@@ -10427,8 +10453,13 @@ assign_xlog_sync_method(int new_sync_method, void *extra)
 void
 issue_xlog_fsync(int fd, XLogSegNo segno)
 {
+	instr_time	start;
+	instr_time	duration;
 	char	   *msg = NULL;
 
+	if (track_io_timing)
+		INSTR_TIME_SET_CURRENT(start);
+
 	pgstat_report_wait_start(WAIT_EVENT_WAL_SYNC);
 	switch (sync_method)
 	{
@@ -10472,6 +10503,21 @@ issue_xlog_fsync(int fd, XLogSegNo segno)
 	}
 
 	pgstat_report_wait_end();
+
+	if (track_io_timing)
+	{
+		INSTR_TIME_SET_CURRENT(duration);
+		INSTR_TIME_SUBTRACT(duration, start);
+		WalStats.m_wal_sync_time += INSTR_TIME_GET_MILLISEC(duration);
+	}
+
+	if (AmWalWriterProcess()){
+		WalStats.m_wal_sync_walwriter++;
+	}
+	else
+	{
+		WalStats.m_wal_sync_backend++;
+	}
 }
 
 /*
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index c2939076..4cb31f01 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -990,7 +990,17 @@ CREATE VIEW pg_stat_bgwriter AS
 
 CREATE VIEW pg_stat_wal AS
     SELECT
+        w.wal_records,
+        w.wal_fpi,
+        w.wal_bytes,
         w.wal_buffers_full,
+        w.wal_init_file,
+        w.wal_write_backend,
+        w.wal_write_walwriter,
+        w.wal_write_time,
+        w.wal_sync_backend,
+        w.wal_sync_walwriter,
+        w.wal_sync_time,
         w.stats_reset
     FROM pg_stat_get_wal() w;
 
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index 822f0ebc..00df866d 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -6756,7 +6756,17 @@ pgstat_recv_bgwriter(PgStat_MsgBgWriter *msg, int len)
 static void
 pgstat_recv_wal(PgStat_MsgWal *msg, int len)
 {
+	walStats.wal_records += msg->m_wal_records;
+	walStats.wal_fpi += msg->m_wal_fpi;
+	walStats.wal_bytes += msg->m_wal_bytes;
 	walStats.wal_buffers_full += msg->m_wal_buffers_full;
+	walStats.wal_init_file += msg->m_wal_init_file;
+	walStats.wal_write_backend += msg->m_wal_write_backend;
+	walStats.wal_write_walwriter += msg->m_wal_write_walwriter;
+	walStats.wal_write_time += msg->m_wal_write_time;
+	walStats.wal_sync_backend += msg->m_wal_sync_backend;
+	walStats.wal_sync_walwriter += msg->m_wal_sync_walwriter;
+	walStats.wal_sync_time += msg->m_wal_sync_time;
 }
 
 /* ----------
diff --git a/src/backend/postmaster/walwriter.c b/src/backend/postmaster/walwriter.c
index a52832fe..ce9f4b7c 100644
--- a/src/backend/postmaster/walwriter.c
+++ b/src/backend/postmaster/walwriter.c
@@ -253,6 +253,9 @@ WalWriterMain(void)
 		else if (left_till_hibernate > 0)
 			left_till_hibernate--;
 
+		/* Send WAL statistics */
+		pgstat_send_wal();
+
 		/*
 		 * Sleep until we are signaled or WalWriterDelay has elapsed.  If we
 		 * haven't done anything useful for quite some time, lengthen the
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index 0d0d2e6d..181d719b 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -1703,7 +1703,7 @@ pg_stat_get_buf_alloc(PG_FUNCTION_ARGS)
 Datum
 pg_stat_get_wal(PG_FUNCTION_ARGS)
 {
-#define PG_STAT_GET_WAL_COLS	2
+#define PG_STAT_GET_WAL_COLS	12
 	TupleDesc	tupdesc;
 	Datum		values[PG_STAT_GET_WAL_COLS];
 	bool		nulls[PG_STAT_GET_WAL_COLS];
@@ -1715,9 +1715,29 @@ pg_stat_get_wal(PG_FUNCTION_ARGS)
 
 	/* Initialise attributes information in the tuple descriptor */
 	tupdesc = CreateTemplateTupleDesc(PG_STAT_GET_WAL_COLS);
-	TupleDescInitEntry(tupdesc, (AttrNumber) 1, "wal_buffers_full",
+	TupleDescInitEntry(tupdesc, (AttrNumber) 1, "wal_records",
 					   INT8OID, -1, 0);
-	TupleDescInitEntry(tupdesc, (AttrNumber) 2, "stats_reset",
+	TupleDescInitEntry(tupdesc, (AttrNumber) 2, "wal_fpi",
+					   INT8OID, -1, 0);
+	TupleDescInitEntry(tupdesc, (AttrNumber) 3, "wal_bytes",
+					   INT8OID, -1, 0);
+	TupleDescInitEntry(tupdesc, (AttrNumber) 4, "wal_buffers_full",
+					   INT8OID, -1, 0);
+	TupleDescInitEntry(tupdesc, (AttrNumber) 5, "wal_init_file",
+					   INT8OID, -1, 0);
+	TupleDescInitEntry(tupdesc, (AttrNumber) 6, "wal_write_backend",
+					   INT8OID, -1, 0);
+	TupleDescInitEntry(tupdesc, (AttrNumber) 7, "wal_write_walwriter",
+					   INT8OID, -1, 0);
+	TupleDescInitEntry(tupdesc, (AttrNumber) 8, "wal_write_time",
+					   INT8OID, -1, 0);
+	TupleDescInitEntry(tupdesc, (AttrNumber) 9, "wal_sync_backend",
+					   INT8OID, -1, 0);
+	TupleDescInitEntry(tupdesc, (AttrNumber) 10, "wal_sync_walwriter",
+					   INT8OID, -1, 0);
+	TupleDescInitEntry(tupdesc, (AttrNumber) 11, "wal_sync_time",
+					   INT8OID, -1, 0);
+	TupleDescInitEntry(tupdesc, (AttrNumber) 12, "stats_reset",
 					   TIMESTAMPTZOID, -1, 0);
 
 	BlessTupleDesc(tupdesc);
@@ -1726,8 +1746,18 @@ pg_stat_get_wal(PG_FUNCTION_ARGS)
 	wal_stats = pgstat_fetch_stat_wal();
 
 	/* Fill values and NULLs */
-	values[0] = Int64GetDatum(wal_stats->wal_buffers_full);
-	values[1] = TimestampTzGetDatum(wal_stats->stat_reset_timestamp);
+	values[0] = Int64GetDatum(wal_stats->wal_records);
+	values[1] = Int64GetDatum(wal_stats->wal_fpi);
+	values[2] = Int64GetDatum(wal_stats->wal_bytes);
+	values[3] = Int64GetDatum(wal_stats->wal_buffers_full);
+	values[4] = Int64GetDatum(wal_stats->wal_init_file);
+	values[5] = Int64GetDatum(wal_stats->wal_write_backend);
+	values[6] = Int64GetDatum(wal_stats->wal_write_walwriter);
+	values[7] = Int64GetDatum(wal_stats->wal_write_time);
+	values[8] = Int64GetDatum(wal_stats->wal_sync_backend);
+	values[9] = Int64GetDatum(wal_stats->wal_sync_walwriter);
+	values[10] = Int64GetDatum(wal_stats->wal_sync_time);
+	values[11] = TimestampTzGetDatum(wal_stats->stat_reset_timestamp);
 
 	/* Returns the record as Datum */
 	PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 22340baf..4c8b089b 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -5492,9 +5492,9 @@
 { oid => '1136', descr => 'statistics: information about WAL activity',
   proname => 'pg_stat_get_wal', proisstrict => 'f', provolatile => 's',
   proparallel => 'r', prorettype => 'record', proargtypes => '',
-  proallargtypes => '{int8,timestamptz}',
-  proargmodes => '{o,o}',
-  proargnames => '{wal_buffers_full,stats_reset}',
+  proallargtypes => '{int8,int8,int8,int8,int8,int8,int8,int8,int8,int8,int8,timestamptz}',
+  proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o}',
+  proargnames => '{wal_records,wal_fpi,wal_bytes,wal_buffers_full,wal_init_file,wal_write_backend,wal_write_walwriter,wal_write_time,wal_sync_backend,wal_sync_walwriter,wal_sync_time,stats_reset}',
   prosrc => 'pg_stat_get_wal' },
 
 { oid => '2306', descr => 'statistics: information about SLRU caches',
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index a821ff4f..fd424a3d 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -459,7 +459,17 @@ typedef struct PgStat_MsgBgWriter
 typedef struct PgStat_MsgWal
 {
 	PgStat_MsgHdr m_hdr;
+	PgStat_Counter m_wal_records;
+	PgStat_Counter m_wal_fpi;
+	PgStat_Counter m_wal_bytes;
 	PgStat_Counter m_wal_buffers_full;
+	PgStat_Counter m_wal_init_file;
+	PgStat_Counter m_wal_write_backend;
+	PgStat_Counter m_wal_write_walwriter;
+	PgStat_Counter m_wal_write_time;	/* accumulate times in milliseconds */
+	PgStat_Counter m_wal_sync_backend;
+	PgStat_Counter m_wal_sync_walwriter;
+	PgStat_Counter m_wal_sync_time; /* accumulate times in milliseconds */
 } PgStat_MsgWal;
 
 /* ----------
@@ -795,7 +805,18 @@ typedef struct PgStat_GlobalStats
  */
 typedef struct PgStat_WalStats
 {
+	PgStat_Counter wal_records;
+	PgStat_Counter wal_fpi;
+	PgStat_Counter wal_bytes;
 	PgStat_Counter wal_buffers_full;
+	PgStat_Counter wal_init_file;
+	PgStat_Counter wal_write_backend;
+	PgStat_Counter wal_write_walwriter;
+	PgStat_Counter wal_write_time;
+	PgStat_Counter wal_sync_backend;
+	PgStat_Counter wal_sync_walwriter;
+	PgStat_Counter wal_sync_time;
+
 	TimestampTz stat_reset_timestamp;
 } PgStat_WalStats;
 
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index cf2a9b44..33ab6521 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -2135,9 +2135,19 @@ pg_stat_user_tables| SELECT pg_stat_all_tables.relid,
     pg_stat_all_tables.autoanalyze_count
    FROM pg_stat_all_tables
   WHERE ((pg_stat_all_tables.schemaname <> ALL (ARRAY['pg_catalog'::name, 'information_schema'::name])) AND (pg_stat_all_tables.schemaname !~ '^pg_toast'::text));
-pg_stat_wal| SELECT w.wal_buffers_full,
+pg_stat_wal| SELECT w.wal_records,
+    w.wal_fpi,
+    w.wal_bytes,
+    w.wal_buffers_full,
+    w.wal_init_file,
+    w.wal_write_backend,
+    w.wal_write_walwriter,
+    w.wal_write_time,
+    w.wal_sync_backend,
+    w.wal_sync_walwriter,
+    w.wal_sync_time,
     w.stats_reset
-   FROM pg_stat_get_wal() w(wal_buffers_full, stats_reset);
+   FROM pg_stat_get_wal() w(wal_records, wal_fpi, wal_bytes, wal_buffers_full, wal_init_file, wal_write_backend, wal_write_walwriter, wal_write_time, wal_sync_backend, wal_sync_walwriter, wal_sync_time, stats_reset);
 pg_stat_wal_receiver| SELECT s.pid,
     s.status,
     s.receive_start_lsn,

Reply via email to