On 13.12.21 at 00:41, Andres Freund wrote:
Hi,

On 2021-12-13 00:00:23 +0100, Gunnar "Nick" Bluth wrote:
Regarding stats size: it adds one PgStat_BackendToastEntry
(PgStat_BackendAttrIdentifier + PgStat_ToastCounts, should be 56-64 bytes or
something in that ballpark) per TOASTable attribute. I can't see that making
any system break a sweat ;-)
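The 56-64 byte estimate can be checked with a small standalone sketch. The typedefs below are stand-ins I am assuming for the real PostgreSQL ones (Oid as a 32-bit unsigned integer, PgStat_Counter as a 64-bit integer, instr_time as a struct timespec); the helper name toast_entry_size() is made up for illustration and is not part of the patch.

```c
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Stand-ins for the PostgreSQL typedefs (assumptions, not the real headers). */
typedef uint32_t Oid;                /* assumed: 32-bit unsigned */
typedef int64_t  PgStat_Counter;     /* assumed: 64-bit counter */
typedef struct timespec instr_time;  /* assumed: clock_gettime-based build */

/* Same field layout as the structs in the attached patch. */
typedef struct PgStat_BackendAttrIdentifier
{
    Oid relid;
    int attr;
} PgStat_BackendAttrIdentifier;

typedef struct PgStat_ToastCounts
{
    PgStat_Counter t_numexternalized;
    PgStat_Counter t_numcompressed;
    PgStat_Counter t_numcompressionsuccess;
    uint64_t       t_size_orig;
    uint64_t       t_size_compressed;
    instr_time     t_comp_time;
} PgStat_ToastCounts;

typedef struct PgStat_BackendToastEntry
{
    PgStat_BackendAttrIdentifier attr;
    PgStat_ToastCounts           t_counts;
} PgStat_BackendToastEntry;

/* Hypothetical helper: bytes one backend-side hash entry occupies. */
size_t
toast_entry_size(void)
{
    return sizeof(PgStat_BackendToastEntry);
}
```

With a 16-byte struct timespec, as on typical 64-bit Linux, this should come out at 8 + 56 = 64 bytes, i.e. the upper end of the estimate. The dynahash bookkeeping per entry is not included.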

That's actually a lot. The problem is that all the stats data for a database
is loaded into private memory for each connection to that database, and that
the stats collector regularly writes out all the stats data for a database.

My understanding is that the stats file is only pulled into the backend when the SQL functions (for the view) are used (see "pgstat_fetch_stat_toastentry()").

Otherwise, a backend just initializes an empty hash, right?

I reduced that hash's initial size from 512 to 32 for the tests below (I guess the "truth" lies somewhere in between). For the attached v0.2 patch I also made the GUC parameter an actual GUC parameter and disabled the elog() calls I had scattered all over the place ;-)

A quick run comparing 1.000.000 INSERTs (2 TOASTable columns each) with and
without "pgstat_track_toast" resulted in 12792.882 ms vs. 12810.557 ms. So
at least the call overhead seems to be negligible.
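To put the two timings in proportion, here is a minimal sketch of the arithmetic. Both function names are made up for illustration. If the figures are in the order stated (with, then without), the tracking run was actually the marginally faster one, which suggests the difference is measurement noise rather than overhead.

```c
/* Relative difference between two timings, in percent of the slower run. */
double
relative_diff_percent(double a_ms, double b_ms)
{
    double diff = a_ms > b_ms ? a_ms - b_ms : b_ms - a_ms;
    double slower = a_ms > b_ms ? a_ms : b_ms;

    return 100.0 * diff / slower;
}

/* Absolute difference spread over all statements, in nanoseconds each. */
double
diff_per_statement_ns(double a_ms, double b_ms, long statements)
{
    double diff_ms = a_ms > b_ms ? a_ms - b_ms : b_ms - a_ms;

    return diff_ms * 1e6 / (double) statements;   /* 1 ms = 1e6 ns */
}
```

Fed with 12792.882 and 12810.557 ms over 1.000.000 INSERTs, this gives a difference of roughly 0.14 %, or a bit under 18 ns per INSERT.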

Yea, you'd probably need a few more tables and a few more connections for it
to have a chance of mattering meaningfully.

So, I went ahead and
* set up 2 clusters with "track_toast" off and on resp.
* created 100 DBs
 * each with 100 tables
 * with one TOASTable column in each table
 * filling those with 32000 bytes of md5 garbage

These clusters sum up to ~ 2GB each, so differences should _start to_ show up, I reckon.

$ du -s testdb*
2161208 testdb
2163240 testdb_tracking

$ du -s testdb*/pg_stat
4448    testdb/pg_stat
4856    testdb_tracking/pg_stat

The db_*.stat files are 42839 vs. 48767 bytes each (so confirmed, the differences do show).
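The 5928 byte difference per file can be related to the on-disk record format in the attached patch, where pgstat_write_db_statsfile() writes one 'O' tag byte followed by the raw PgStat_StatToastEntry. The sketch below redoes that arithmetic; the typedefs are stand-ins I am assuming for the real ones, and the two helper functions are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the PostgreSQL typedefs (assumptions). */
typedef uint32_t Oid;
typedef int64_t  PgStat_Counter;

typedef struct PgStat_BackendAttrIdentifier
{
    Oid relid;
    int attr;
} PgStat_BackendAttrIdentifier;

/* Same field layout as the collector-side struct in the patch. */
typedef struct PgStat_StatToastEntry
{
    PgStat_BackendAttrIdentifier t_id;
    PgStat_Counter t_numexternalized;
    PgStat_Counter t_numcompressed;
    PgStat_Counter t_numcompressionsuccess;
    uint64_t       t_size_orig;
    uint64_t       t_size_compressed;
    PgStat_Counter t_comp_time;
} PgStat_StatToastEntry;

/* Bytes written per entry: one tag byte plus the raw struct. */
size_t
toast_record_bytes(void)
{
    return 1 + sizeof(PgStat_StatToastEntry);
}

/* Number of whole TOAST records that fit into a file size difference. */
long
toast_records_in(long delta_bytes)
{
    return delta_bytes / (long) toast_record_bytes();
}
```

With 57 bytes per record, 48767 - 42839 = 5928 bytes is exactly 104 records, so slightly more than the 100 columns created per DB. Where the extra four entries come from is not something these numbers tell us.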


No idea if this is telling us anything, tbth, but the /proc/<pid>/smaps_rollup output for a backend serving one of these DBs looks like this ("0 kB" lines omitted):

track_toast OFF
===============
Rss:               12428 kB
Pss:                5122 kB
Pss_Anon:           1310 kB
Pss_File:           2014 kB
Pss_Shmem:          1797 kB
Shared_Clean:       5864 kB
Shared_Dirty:       3500 kB
Private_Clean:      1088 kB
Private_Dirty:      1976 kB
Referenced:        11696 kB
Anonymous:          2120 kB

track_toast ON (view not called yet):
=====================================
Rss:               12300 kB
Pss:                4883 kB
Pss_Anon:           1309 kB
Pss_File:           1888 kB
Pss_Shmem:          1685 kB
Shared_Clean:       6040 kB
Shared_Dirty:       3468 kB
Private_Clean:       896 kB
Private_Dirty:      1896 kB
Referenced:        11572 kB
Anonymous:          2116 kB

track_toast ON (view called):
=============================
Rss:               15408 kB
Pss:                7482 kB
Pss_Anon:           2083 kB
Pss_File:           2572 kB
Pss_Shmem:          2826 kB
Shared_Clean:       6616 kB
Shared_Dirty:       3532 kB
Private_Clean:      1472 kB
Private_Dirty:      3788 kB
Referenced:        14704 kB
Anonymous:          2884 kB

That backend used some memory for displaying the result too, of course...
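For anyone wanting to repeat the comparison, here is a rough sketch of pulling single fields out of smaps_rollup so the before/after values can be subtracted. It is Linux-only, both function names are made up, and it assumes the "Key:   value kB" line format shown above.

```c
#include <stdio.h>
#include <string.h>

/*
 * Parse one line such as "Rss:               12428 kB".
 * Returns the value in kB, or -1 if the line is not exactly that key.
 */
long
smaps_field_kb(const char *line, const char *key)
{
    size_t keylen = strlen(key);
    long   kb;

    /* Require "key:" so that "Pss" does not match "Pss_Anon". */
    if (strncmp(line, key, keylen) != 0 || line[keylen] != ':')
        return -1;
    if (sscanf(line + keylen + 1, "%ld", &kb) != 1)
        return -1;
    return kb;
}

/* Read one field for a given pid; returns -1 if the file or key is missing. */
long
smaps_rollup_kb(int pid, const char *key)
{
    char  path[64];
    char  line[256];
    long  result = -1;
    FILE *fp;

    snprintf(path, sizeof(path), "/proc/%d/smaps_rollup", pid);
    fp = fopen(path, "r");
    if (fp == NULL)
        return -1;

    while (fgets(line, sizeof(line), fp) != NULL)
    {
        long kb = smaps_field_kb(line, key);

        if (kb >= 0)
        {
            result = kb;
            break;
        }
    }
    fclose(fp);
    return result;
}
```

Applied to the figures above, Rss goes from 12300 kB (tracking on, view not called) to 15408 kB (view called), a difference of 3108 kB, part of which is the result set itself.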

A backend with just two TOAST columns in one table (filled with 1.000.001 rows) looks like this before and after calling the "pg_stat_toast" view:
Rss:              146208 kB
Pss:              116181 kB
Pss_Anon:           2050 kB
Pss_File:           2787 kB
Pss_Shmem:        111342 kB
Shared_Clean:       6636 kB
Shared_Dirty:      45928 kB
Private_Clean:      1664 kB
Private_Dirty:     91980 kB
Referenced:       145532 kB
Anonymous:          2844 kB

Rss:              147736 kB
Pss:              103296 kB
Pss_Anon:           2430 kB
Pss_File:           3147 kB
Pss_Shmem:         97718 kB
Shared_Clean:       6992 kB
Shared_Dirty:      74056 kB
Private_Clean:      1984 kB
Private_Dirty:     64704 kB
Referenced:       147092 kB
Anonymous:          3224 kB

After creating 10.000 more tables (view shows 10.007 rows now), before and after calling "TABLE pg_stat_toast":
Rss:               13816 kB
Pss:                4898 kB
Pss_Anon:           1314 kB
Pss_File:           1755 kB
Pss_Shmem:          1829 kB
Shared_Clean:       5972 kB
Shared_Dirty:       5760 kB
Private_Clean:       832 kB
Private_Dirty:      1252 kB
Referenced:        13088 kB
Anonymous:          2124 kB

Rss:              126816 kB
Pss:               55213 kB
Pss_Anon:           5383 kB
Pss_File:           2615 kB
Pss_Shmem:         47215 kB
Shared_Clean:       6460 kB
Shared_Dirty:     113028 kB
Private_Clean:      1600 kB
Private_Dirty:      5728 kB
Referenced:       126112 kB
Anonymous:          6184 kB


That DB's stat-file is now 4.119.254 bytes (3.547.439 without track_toast).

After VACUUM ANALYZE, the size goes up to 5.919.812 (5.348.768) bytes.
The stat files of the "100 tables" DBs go to 97.910 (91.868) bytes each.

In total:
$ du -s testdb*/pg_stat
14508   testdb/pg_stat
15472   testdb_tracking/pg_stat


IMHO, this would be ok to at least enable temporarily (e.g. to find out if MAIN or EXTERNAL storage/LZ4 compression would be ok/better for some columns).
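As a sketch of how the counters could feed that decision: the struct below mirrors four columns of the pg_stat_toast view from the attached patch, while the struct itself, the function names and the sample figures in the note are made up for illustration.

```c
#include <stdint.h>

/* One row of the proposed pg_stat_toast view (subset of its columns). */
typedef struct ToastColumnStats
{
    int64_t  compressions;          /* compression attempts */
    int64_t  compressionsuccesses;  /* attempts that shrank the datum */
    uint64_t originalsizesum;       /* bytes before compression */
    uint64_t compressedsizesum;     /* bytes after compression */
} ToastColumnStats;

/* Fraction of compression attempts that succeeded (0.0 if none were made). */
double
compression_success_rate(const ToastColumnStats *s)
{
    if (s->compressions <= 0)
        return 0.0;
    return (double) s->compressionsuccesses / (double) s->compressions;
}

/* Compressed size as a fraction of original size (0.0 if nothing recorded). */
double
compression_ratio(const ToastColumnStats *s)
{
    if (s->originalsizesum == 0)
        return 0.0;
    return (double) s->compressedsizesum / (double) s->originalsizesum;
}
```

A column with 1000 attempts, 900 successes and 32.000.000 bytes shrunk to 8.000.000 would show a success rate of 0.9 and a ratio of 0.25. A low success rate would be a hint that EXTERNAL storage might save the wasted compression attempts, but where to draw that line is a judgment call.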

All the best,
--
Gunnar "Nick" Bluth

From aa89c5183d4a9ab99b6a07a456ec12ed3934b930 Mon Sep 17 00:00:00 2001
From: "Gunnar \"Nick\" Bluth" <gunnar.bl...@pro-open.de>
Date: Mon, 13 Dec 2021 14:14:40 +0100
Subject: [PATCH] * make pgstat_track_toast a "real" GUC * reduce size of
 initial hash per backend to 32 * disable DEBUG2 elog() calls

---
 pg_stat_toast.sql                       |  19 ++
 src/backend/access/table/toast_helper.c |  19 ++
 src/backend/postmaster/pgstat.c         | 311 +++++++++++++++++++++++-
 src/backend/utils/adt/pgstatfuncs.c     |  60 +++++
 src/backend/utils/misc/guc.c            |   9 +
 src/include/catalog/pg_proc.dat         |  21 ++
 src/include/pgstat.h                    | 109 +++++++++
 7 files changed, 542 insertions(+), 6 deletions(-)
 create mode 100644 pg_stat_toast.sql

diff --git a/pg_stat_toast.sql b/pg_stat_toast.sql
new file mode 100644
index 0000000000..1c653254ab
--- /dev/null
+++ b/pg_stat_toast.sql
@@ -0,0 +1,19 @@
+-- This creates a useable view, but the offset of 1 is annoying.
+-- That "-1" is probably better done in the helper functions...
+
+CREATE OR REPLACE VIEW pg_stat_toast AS
+ SELECT 
+    n.nspname AS schemaname,
+    a.attrelid AS reloid,
+    a.attnum AS attnum,
+    c.relname AS relname,
+    a.attname AS attname,
+    pg_stat_get_toast_externalizations(a.attrelid,a.attnum -1) AS externalizations,
+    pg_stat_get_toast_compressions(a.attrelid,a.attnum -1) AS compressions,
+    pg_stat_get_toast_compressionsuccesses(a.attrelid,a.attnum -1) AS compressionsuccesses,
+    pg_stat_get_toast_compressedsizesum(a.attrelid,a.attnum -1) AS compressionsizesum,
+    pg_stat_get_toast_originalsizesum(a.attrelid,a.attnum -1) AS originalsizesum
+   FROM pg_attribute a
+   JOIN pg_class c ON c.oid = a.attrelid
+   LEFT JOIN pg_namespace n ON n.oid = c.relnamespace
+  WHERE pg_stat_get_toast_externalizations(a.attrelid,a.attnum -1) IS NOT NULL;
diff --git a/src/backend/access/table/toast_helper.c b/src/backend/access/table/toast_helper.c
index 013236b73d..49545885d5 100644
--- a/src/backend/access/table/toast_helper.c
+++ b/src/backend/access/table/toast_helper.c
@@ -19,6 +19,7 @@
 #include "access/toast_helper.h"
 #include "access/toast_internals.h"
 #include "catalog/pg_type_d.h"
+#include "pgstat.h"
 
 
 /*
@@ -239,6 +240,12 @@ toast_tuple_try_compression(ToastTupleContext *ttc, int attribute)
 			pfree(DatumGetPointer(*value));
 		*value = new_value;
 		attr->tai_colflags |= TOASTCOL_NEEDS_FREE;
+		pgstat_report_toast_activity(ttc->ttc_rel->rd_rel->oid, attribute,
+							false,
+							true,
+							attr->tai_size,
+							VARSIZE(DatumGetPointer(*value)),
+							0);
 		attr->tai_size = VARSIZE(DatumGetPointer(*value));
 		ttc->ttc_flags |= (TOAST_NEEDS_CHANGE | TOAST_NEEDS_FREE);
 	}
@@ -246,6 +253,12 @@ toast_tuple_try_compression(ToastTupleContext *ttc, int attribute)
 	{
 		/* incompressible, ignore on subsequent compression passes */
 		attr->tai_colflags |= TOASTCOL_INCOMPRESSIBLE;
+		pgstat_report_toast_activity(ttc->ttc_rel->rd_rel->oid, attribute,
+							false,
+							true,
+							0,
+							0,
+							0);
 	}
 }
 
@@ -266,6 +279,12 @@ toast_tuple_externalize(ToastTupleContext *ttc, int attribute, int options)
 		pfree(DatumGetPointer(old_value));
 	attr->tai_colflags |= TOASTCOL_NEEDS_FREE;
 	ttc->ttc_flags |= (TOAST_NEEDS_CHANGE | TOAST_NEEDS_FREE);
+	pgstat_report_toast_activity(ttc->ttc_rel->rd_rel->oid, attribute,
+							true,
+							false,
+							0,
+							0,
+							0);
 }
 
 /*
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index 7264d2c727..4176a41418 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -106,6 +106,7 @@
 #define PGSTAT_DB_HASH_SIZE		16
 #define PGSTAT_TAB_HASH_SIZE	512
 #define PGSTAT_FUNCTION_HASH_SIZE	512
+#define PGSTAT_TOAST_HASH_SIZE	32
 #define PGSTAT_SUBWORKER_HASH_SIZE	32
 #define PGSTAT_REPLSLOT_HASH_SIZE	32
 
@@ -116,6 +117,7 @@
  */
 bool		pgstat_track_counts = false;
 int			pgstat_track_functions = TRACK_FUNC_OFF;
+bool		pgstat_track_toast = true;
 
 /* ----------
  * Built from GUC parameter
@@ -228,6 +230,19 @@ static HTAB *pgStatFunctions = NULL;
  */
 static bool have_function_stats = false;
 
+/*
+ * Backends store per-toast-column info that's waiting to be sent to the collector
+ * in this hash table (indexed by column's PgStat_BackendAttrIdentifier).
+ */
+static HTAB *pgStatToastActions = NULL;
+
+
+/*
+ * Indicates if backend has some toast stats that it hasn't yet
+ * sent to the collector.
+ */
+static bool have_toast_stats = false;
+
 /*
  * Tuple insertion/deletion counts for an open transaction can't be propagated
  * into PgStat_TableStatus counters until we know if it is going to commit
@@ -328,7 +343,7 @@ static PgStat_StatSubWorkerEntry *pgstat_get_subworker_entry(PgStat_StatDBEntry
 static void pgstat_write_statsfiles(bool permanent, bool allDbs);
 static void pgstat_write_db_statsfile(PgStat_StatDBEntry *dbentry, bool permanent);
 static HTAB *pgstat_read_statsfiles(Oid onlydb, bool permanent, bool deep);
-static void pgstat_read_db_statsfile(Oid databaseid, HTAB *tabhash, HTAB *funchash,
+static void pgstat_read_db_statsfile(Oid databaseid, HTAB *tabhash, HTAB *funchash, HTAB *toasthash,
 									 HTAB *subworkerhash, bool permanent);
 static void backend_read_statsfile(void);
 
@@ -340,6 +355,7 @@ static void pgstat_reset_replslot(PgStat_StatReplSlotEntry *slotstats, Timestamp
 
 static void pgstat_send_tabstat(PgStat_MsgTabstat *tsmsg, TimestampTz now);
 static void pgstat_send_funcstats(void);
+static void pgstat_send_toaststats(void);
 static void pgstat_send_slru(void);
 static void pgstat_send_subscription_purge(PgStat_MsgSubscriptionPurge *msg);
 static HTAB *pgstat_collect_oids(Oid catalogid, AttrNumber anum_oid);
@@ -373,6 +389,7 @@ static void pgstat_recv_wal(PgStat_MsgWal *msg, int len);
 static void pgstat_recv_slru(PgStat_MsgSLRU *msg, int len);
 static void pgstat_recv_funcstat(PgStat_MsgFuncstat *msg, int len);
 static void pgstat_recv_funcpurge(PgStat_MsgFuncpurge *msg, int len);
+static void pgstat_recv_toaststat(PgStat_MsgToaststat *msg, int len);
 static void pgstat_recv_recoveryconflict(PgStat_MsgRecoveryConflict *msg, int len);
 static void pgstat_recv_deadlock(PgStat_MsgDeadlock *msg, int len);
 static void pgstat_recv_checksum_failure(PgStat_MsgChecksumFailure *msg, int len);
@@ -891,7 +908,7 @@ pgstat_report_stat(bool disconnect)
 		pgStatXactCommit == 0 && pgStatXactRollback == 0 &&
 		pgWalUsage.wal_records == prevWalUsage.wal_records &&
 		WalStats.m_wal_write == 0 && WalStats.m_wal_sync == 0 &&
-		!have_function_stats && !disconnect)
+		!have_function_stats && !have_toast_stats && !disconnect)
 		return;
 
 	/*
@@ -983,6 +1000,9 @@ pgstat_report_stat(bool disconnect)
 	/* Now, send function statistics */
 	pgstat_send_funcstats();
 
+	/* Now, send TOAST statistics */
+	pgstat_send_toaststats();
+
 	/* Send WAL statistics */
 	pgstat_send_wal(true);
 
@@ -1116,6 +1136,64 @@ pgstat_send_funcstats(void)
 	have_function_stats = false;
 }
 
+/*
+ * Subroutine for pgstat_report_stat: populate and send a toast stat message
+ */
+static void
+pgstat_send_toaststats(void)
+{
+	/* we assume this inits to all zeroes: */
+	static const PgStat_ToastCounts all_zeroes;
+
+	PgStat_MsgToaststat msg;
+	PgStat_BackendToastEntry *entry;
+	HASH_SEQ_STATUS tstat;
+
+	if (pgStatToastActions == NULL)
+		return;
+
+	pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_TOASTSTAT);
+	msg.m_databaseid = MyDatabaseId;
+	msg.m_nentries = 0;
+
+	hash_seq_init(&tstat, pgStatToastActions);
+	while ((entry = (PgStat_BackendToastEntry *) hash_seq_search(&tstat)) != NULL)
+	{
+		PgStat_ToastEntry *m_ent;
+
+		/* Skip it if no counts accumulated since last time */
+		if (memcmp(&entry->t_counts, &all_zeroes,
+				   sizeof(PgStat_ToastCounts)) == 0)
+			continue;
+
+		/* need to convert format of time accumulators */
+		m_ent = &msg.m_entry[msg.m_nentries];
+		m_ent->attr = entry->attr;
+		m_ent->t_numexternalized = entry->t_counts.t_numexternalized;
+		m_ent->t_numcompressed = entry->t_counts.t_numcompressed;
+		m_ent->t_numcompressionsuccess = entry->t_counts.t_numcompressionsuccess;
+		m_ent->t_size_orig = entry->t_counts.t_size_orig;
+		m_ent->t_size_compressed = entry->t_counts.t_size_compressed;
+		m_ent->t_comp_time = INSTR_TIME_GET_MICROSEC(entry->t_counts.t_comp_time);
+
+		if (++msg.m_nentries >= PGSTAT_NUM_TOASTENTRIES)
+		{
+			pgstat_send(&msg, offsetof(PgStat_MsgToaststat, m_entry[0]) +
+						msg.m_nentries * sizeof(PgStat_ToastEntry));
+			msg.m_nentries = 0;
+		}
+
+		/* reset the entry's counts */
+		MemSet(&entry->t_counts, 0, sizeof(PgStat_ToastCounts));
+	}
+
+	if (msg.m_nentries > 0)
+		pgstat_send(&msg, offsetof(PgStat_MsgToaststat, m_entry[0]) +
+					msg.m_nentries * sizeof(PgStat_ToastEntry));
+
+	have_toast_stats = false;
+}
+
 
 /* ----------
  * pgstat_vacuum_stat() -
@@ -2151,6 +2229,76 @@ pgstat_end_function_usage(PgStat_FunctionCallUsage *fcu, bool finalize)
 	have_function_stats = true;
 }
 
+/*
+ * Report TOAST activity
+ * Called by toast_helper functions.
+ */
+void
+pgstat_report_toast_activity(Oid relid, int attr,
+							bool externalized,
+							bool compressed,
+							int32 old_size,
+							int32 new_size,
+							int32 time_spent)
+{
+	PgStat_BackendAttrIdentifier toastattr = { relid, attr };
+	PgStat_BackendToastEntry *htabent;
+	bool		found;
+
+	if (pgStatSock == PGINVALID_SOCKET || !pgstat_track_toast)
+		return;
+
+	if (!pgStatToastActions)
+	{
+		/* First time through - initialize toast stat table */
+		HASHCTL		hash_ctl;
+
+		hash_ctl.keysize = sizeof(PgStat_BackendAttrIdentifier);
+		hash_ctl.entrysize = sizeof(PgStat_BackendToastEntry);
+		pgStatToastActions = hash_create("TOAST stat entries",
+									  PGSTAT_TOAST_HASH_SIZE,
+									  &hash_ctl,
+									  HASH_ELEM | HASH_BLOBS);
+	}
+
+	/* Get the stats entry for this TOAST attribute, create if necessary */
+	htabent = hash_search(pgStatToastActions, &toastattr,
+						  HASH_ENTER, &found);
+	if (!found)
+	{
+		/* elog(DEBUG2, "No toast entry found for attr %u of relation %u", attr, relid); */
+		MemSet(&htabent->t_counts, 0, sizeof(PgStat_ToastCounts));
+	}
+
+	/* update counters */
+	if (externalized)
+	{
+		htabent->t_counts.t_numexternalized++;
+		/* elog(DEBUG2, "Externalized counter raised for OID %u, attr %u, now %li", relid,attr, htabent->t_counts.t_numexternalized); */
+	}
+	if (compressed)
+	{
+		htabent->t_counts.t_numcompressed++;
+		/* elog(DEBUG2, "Compressed counter raised for OID %u, attr %u, now %li", relid,attr, htabent->t_counts.t_numcompressed); */
+		if (new_size)
+		{
+			htabent->t_counts.t_size_orig+=old_size;
+			/* elog(DEBUG2, "Old size %u added for OID %u, attr %u, now %li",old_size,relid,attr,  htabent->t_counts.t_size_orig); */
+			if (new_size)
+			{
+				htabent->t_counts.t_numcompressionsuccess++;
+				/* elog(DEBUG2, "Compressed success counter raised for OID %u, attr %u, now %li",relid,attr, htabent->t_counts.t_numcompressionsuccess); */
+				htabent->t_counts.t_size_compressed+=new_size;
+				/* elog(DEBUG2, "New size %u added for OID %u, attr %u, now %li",new_size,relid,attr, htabent->t_counts.t_size_compressed); */
+			}
+		}
+		/* TODO: record times */
+	}	
+	
+	/* indicate that we have something to send */
+	have_toast_stats = true;
+}
+
 
 /* ----------
  * pgstat_initstats() -
@@ -3028,6 +3176,35 @@ pgstat_fetch_stat_subworker_entry(Oid subid, Oid subrelid)
 	return wentry;
 }
 
+/* ----------
+ * pgstat_fetch_stat_toastentry() -
+ *
+ *	Support function for the SQL-callable pgstat* functions. Returns
+ *	the collected statistics for one TOAST attribute or NULL.
+ * ----------
+ */
+PgStat_StatToastEntry *
+pgstat_fetch_stat_toastentry(Oid rel_id, int attr)
+{
+	PgStat_StatDBEntry *dbentry;
+	PgStat_BackendAttrIdentifier toast_id = { rel_id, attr };
+	PgStat_StatToastEntry *toastentry = NULL;
+
+	/* load the stats file if needed */
+	backend_read_statsfile();
+
+	/* Lookup our database, then find the requested TOAST activity stats.  */
+	dbentry = pgstat_fetch_stat_dbentry(MyDatabaseId);
+	if (dbentry != NULL && dbentry->toastactivity != NULL)
+	{
+		toastentry = (PgStat_StatToastEntry *) hash_search(dbentry->toastactivity,
+														 (void *) &toast_id,
+														 HASH_FIND, NULL);
+	}
+
+	return toastentry;
+}
+
 /*
  * ---------
  * pgstat_fetch_stat_archiver() -
@@ -3708,6 +3885,10 @@ PgstatCollectorMain(int argc, char *argv[])
 					pgstat_recv_funcpurge(&msg.msg_funcpurge, len);
 					break;
 
+				case PGSTAT_MTYPE_TOASTSTAT:
+					pgstat_recv_toaststat(&msg.msg_toaststat, len);
+					break;
+				
 				case PGSTAT_MTYPE_RECOVERYCONFLICT:
 					pgstat_recv_recoveryconflict(&msg.msg_recoveryconflict,
 												 len);
@@ -3852,6 +4033,14 @@ reset_dbentry_counters(PgStat_StatDBEntry *dbentry)
 									  PGSTAT_SUBWORKER_HASH_SIZE,
 									  &hash_ctl,
 									  HASH_ELEM | HASH_BLOBS);
+
+	hash_ctl.keysize = sizeof(PgStat_BackendAttrIdentifier);
+	hash_ctl.entrysize = sizeof(PgStat_StatToastEntry);
+	dbentry->toastactivity = hash_create("Per-database TOAST",
+									 PGSTAT_TOAST_HASH_SIZE,
+									 &hash_ctl,
+									 HASH_ELEM | HASH_BLOBS);
+
 }
 
 /*
@@ -4059,8 +4248,8 @@ pgstat_write_statsfiles(bool permanent, bool allDbs)
 	while ((dbentry = (PgStat_StatDBEntry *) hash_seq_search(&hstat)) != NULL)
 	{
 		/*
-		 * Write out the table, function, and subscription-worker stats for
-		 * this DB into the appropriate per-DB stat file, if required.
+		 * Write out the table, function, TOAST and subscription-worker stats for this DB into the
+		 * appropriate per-DB stat file, if required.
 		 */
 		if (allDbs || pgstat_db_requested(dbentry->databaseid))
 		{
@@ -4175,9 +4364,11 @@ pgstat_write_db_statsfile(PgStat_StatDBEntry *dbentry, bool permanent)
 	HASH_SEQ_STATUS tstat;
 	HASH_SEQ_STATUS fstat;
 	HASH_SEQ_STATUS sstat;
+	HASH_SEQ_STATUS ostat;
 	PgStat_StatTabEntry *tabentry;
 	PgStat_StatFuncEntry *funcentry;
 	PgStat_StatSubWorkerEntry *subwentry;
+	PgStat_StatToastEntry *toastentry;
 	FILE	   *fpout;
 	int32		format_id;
 	Oid			dbid = dbentry->databaseid;
@@ -4243,6 +4434,17 @@ pgstat_write_db_statsfile(PgStat_StatDBEntry *dbentry, bool permanent)
 		(void) rc;				/* we'll check for error with ferror */
 	}
 
+	/*
+	 * Walk through the database's TOAST stats table.
+	 */
+	hash_seq_init(&ostat, dbentry->toastactivity);
+	while ((toastentry = (PgStat_StatToastEntry *) hash_seq_search(&ostat)) != NULL)
+	{
+		fputc('O', fpout);
+		rc = fwrite(toastentry, sizeof(PgStat_StatToastEntry), 1, fpout);
+		(void) rc;				/* we'll check for error with ferror */
+	}
+
 	/*
 	 * No more output to be done. Close the temp file and replace the old
 	 * pgstat.stat with it.  The ferror() check replaces testing for error
@@ -4483,6 +4685,7 @@ pgstat_read_statsfiles(Oid onlydb, bool permanent, bool deep)
 				dbentry->tables = NULL;
 				dbentry->functions = NULL;
 				dbentry->subworkers = NULL;
+				dbentry->toastactivity = NULL;
 
 				/*
 				 * In the collector, disregard the timestamp we read from the
@@ -4528,6 +4731,14 @@ pgstat_read_statsfiles(Oid onlydb, bool permanent, bool deep)
 												  &hash_ctl,
 												  HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
 
+				hash_ctl.keysize = sizeof(PgStat_BackendAttrIdentifier);
+				hash_ctl.entrysize = sizeof(PgStat_StatToastEntry);
+				hash_ctl.hcxt = pgStatLocalContext;
+				dbentry->toastactivity = hash_create("Per-database toast information",
+												 PGSTAT_TOAST_HASH_SIZE,
+												 &hash_ctl,
+												 HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
+
 				/*
 				 * If requested, read the data from the database-specific
 				 * file.  Otherwise we just leave the hashtables empty.
@@ -4536,6 +4747,7 @@ pgstat_read_statsfiles(Oid onlydb, bool permanent, bool deep)
 					pgstat_read_db_statsfile(dbentry->databaseid,
 											 dbentry->tables,
 											 dbentry->functions,
+											 dbentry->toastactivity,
 											 dbentry->subworkers,
 											 permanent);
 
@@ -4620,7 +4832,7 @@ done:
  * ----------
  */
 static void
-pgstat_read_db_statsfile(Oid databaseid, HTAB *tabhash, HTAB *funchash,
+pgstat_read_db_statsfile(Oid databaseid, HTAB *tabhash, HTAB *funchash, HTAB *toasthash,
 						 HTAB *subworkerhash, bool permanent)
 {
 	PgStat_StatTabEntry *tabentry;
@@ -4629,6 +4841,8 @@ pgstat_read_db_statsfile(Oid databaseid, HTAB *tabhash, HTAB *funchash,
 	PgStat_StatFuncEntry *funcentry;
 	PgStat_StatSubWorkerEntry subwbuf;
 	PgStat_StatSubWorkerEntry *subwentry;
+	PgStat_StatToastEntry toastbuf;
+	PgStat_StatToastEntry *toastentry;
 	FILE	   *fpin;
 	int32		format_id;
 	bool		found;
@@ -4777,6 +4991,32 @@ pgstat_read_db_statsfile(Oid databaseid, HTAB *tabhash, HTAB *funchash,
 				memcpy(subwentry, &subwbuf, sizeof(subwbuf));
 				break;
 
+
+				/*
+				 * 'O'	A PgStat_StatToastEntry follows (tOast)
+				 */
+			case 'O':
+				if (fread(&toastbuf, 1, sizeof(PgStat_StatToastEntry),
+						  fpin) != sizeof(PgStat_StatToastEntry))
+				{
+					ereport(pgStatRunningInCollector ? LOG : WARNING,
+							(errmsg("corrupted statistics file \"%s\"",
+									statfile)));
+					goto done;
+				}
+
+				/*
+				 * Skip if TOAST data not wanted.
+				 */
+				if (toasthash == NULL)
+					break;
+
+				toastentry = (PgStat_StatToastEntry *) hash_search(toasthash,
+																 (void *) &toastbuf.t_id,
+																 HASH_ENTER, &found);
+				memcpy(toastentry, &toastbuf, sizeof(toastbuf));
+				break;
+
 				/*
 				 * 'E'	The EOF marker of a complete stats file.
 				 */
@@ -5452,6 +5692,8 @@ pgstat_recv_dropdb(PgStat_MsgDropdb *msg, int len)
 			hash_destroy(dbentry->functions);
 		if (dbentry->subworkers != NULL)
 			hash_destroy(dbentry->subworkers);
+		if (dbentry->toastactivity != NULL)
+			hash_destroy(dbentry->toastactivity);
 
 		if (hash_search(pgStatDBHash,
 						(void *) &dbid,
@@ -5491,10 +5733,12 @@ pgstat_recv_resetcounter(PgStat_MsgResetcounter *msg, int len)
 		hash_destroy(dbentry->functions);
 	if (dbentry->subworkers != NULL)
 		hash_destroy(dbentry->subworkers);
-
+	if (dbentry->toastactivity != NULL)
+		hash_destroy(dbentry->toastactivity);
 	dbentry->tables = NULL;
 	dbentry->functions = NULL;
 	dbentry->subworkers = NULL;
+	dbentry->toastactivity = NULL;
 
 	/*
 	 * Reset database-level stats, too.  This creates empty hash tables for
@@ -6152,6 +6396,61 @@ pgstat_recv_subscription_purge(PgStat_MsgSubscriptionPurge *msg, int len)
 	}
 }
 
+/* ----------
+ * pgstat_recv_toaststat() -
+ *
+ *	Count what the backend has done.
+ * ----------
+ */
+static void
+pgstat_recv_toaststat(PgStat_MsgToaststat *msg, int len)
+{
+	PgStat_ToastEntry *toastmsg = &(msg->m_entry[0]);
+	PgStat_StatDBEntry *dbentry;
+	PgStat_StatToastEntry *toastentry;
+	int			i;
+	bool		found;
+
+	elog(DEBUG2, "Received TOAST statistics...");
+	dbentry = pgstat_get_db_entry(msg->m_databaseid, true);
+
+	/*
+	 * Process all TOAST entries in the message.
+	 */
+	for (i = 0; i < msg->m_nentries; i++, toastmsg++)
+	{
+		toastentry = (PgStat_StatToastEntry *) hash_search(dbentry->toastactivity,
+														 (void *) &(toastmsg->attr),
+														 HASH_ENTER, &found);
+
+		if (!found)
+		{
+			/*
+			 * If it's a new entry, initialize counters to the values
+			 * we just got.
+			 */
+			elog(DEBUG2, "First time I see this toastentry");
+			toastentry->t_numexternalized = toastmsg->t_numexternalized;
+			toastentry->t_numcompressed = toastmsg->t_numcompressed;
+			toastentry->t_numcompressionsuccess = toastmsg->t_numcompressionsuccess;
+			toastentry->t_size_compressed = toastmsg->t_size_compressed;
+			toastentry->t_size_orig = toastmsg->t_size_orig;
+		}
+		else
+		{
+			/*
+			 * Otherwise add the values to the existing entry.
+			 */
+			elog(DEBUG2, "Found this toastentry, updating");
+			toastentry->t_numexternalized += toastmsg->t_numexternalized;
+			toastentry->t_numcompressed += toastmsg->t_numcompressed;
+			toastentry->t_numcompressionsuccess += toastmsg->t_numcompressionsuccess;
+			toastentry->t_size_compressed += toastmsg->t_size_compressed;
+			toastentry->t_size_orig += toastmsg->t_size_orig;
+		}
+	}
+}
+
 /* ----------
  * pgstat_recv_subworker_error() -
  *
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index f529c1561a..bbdcbe14ee 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -410,6 +410,66 @@ pg_stat_get_function_self_time(PG_FUNCTION_ARGS)
 	PG_RETURN_FLOAT8(((double) funcentry->f_self_time) / 1000.0);
 }
 
+Datum
+pg_stat_get_toast_externalizations(PG_FUNCTION_ARGS)
+{
+	Oid			relid = PG_GETARG_OID(0);
+	int			attr = PG_GETARG_INT32(1);
+	PgStat_StatToastEntry *toastentry;
+
+	if ((toastentry = pgstat_fetch_stat_toastentry(relid,attr)) == NULL)
+		PG_RETURN_NULL();
+	PG_RETURN_INT64(toastentry->t_numexternalized);
+}
+
+Datum
+pg_stat_get_toast_compressions(PG_FUNCTION_ARGS)
+{
+	Oid			relid = PG_GETARG_OID(0);
+	int			attr = PG_GETARG_INT32(1);
+	PgStat_StatToastEntry *toastentry;
+
+	if ((toastentry = pgstat_fetch_stat_toastentry(relid,attr)) == NULL)
+		PG_RETURN_NULL();
+	PG_RETURN_INT64(toastentry->t_numcompressed);
+}
+
+Datum
+pg_stat_get_toast_compressionsuccesses(PG_FUNCTION_ARGS)
+{
+	Oid			relid = PG_GETARG_OID(0);
+	int			attr = PG_GETARG_INT32(1);
+	PgStat_StatToastEntry *toastentry;
+
+	if ((toastentry = pgstat_fetch_stat_toastentry(relid,attr)) == NULL)
+		PG_RETURN_NULL();
+	PG_RETURN_INT64(toastentry->t_numcompressionsuccess);
+}
+
+Datum
+pg_stat_get_toast_originalsizesum(PG_FUNCTION_ARGS)
+{
+	Oid			relid = PG_GETARG_OID(0);
+	int			attr = PG_GETARG_INT32(1);
+	PgStat_StatToastEntry *toastentry;
+
+	if ((toastentry = pgstat_fetch_stat_toastentry(relid,attr)) == NULL)
+		PG_RETURN_NULL();
+	PG_RETURN_INT64(toastentry->t_size_orig);
+}
+
+Datum
+pg_stat_get_toast_compressedsizesum(PG_FUNCTION_ARGS)
+{
+	Oid			relid = PG_GETARG_OID(0);
+	int			attr = PG_GETARG_INT32(1);
+	PgStat_StatToastEntry *toastentry;
+
+	if ((toastentry = pgstat_fetch_stat_toastentry(relid,attr)) == NULL)
+		PG_RETURN_NULL();
+	PG_RETURN_INT64(toastentry->t_size_compressed);
+}
+
 Datum
 pg_stat_get_backend_idset(PG_FUNCTION_ARGS)
 {
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index ee6a838b3a..8114b8841d 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -1540,6 +1540,15 @@ static struct config_bool ConfigureNamesBool[] =
 		true,
 		NULL, NULL, NULL
 	},
+	{
+		{"track_toast", PGC_SUSET, STATS_COLLECTOR,
+			gettext_noop("Collects statistics on TOAST activity."),
+			NULL
+		},
+		&pgstat_track_toast,
+		false,
+		NULL, NULL, NULL
+	},
 	{
 		{"track_io_timing", PGC_SUSET, STATS_COLLECTOR,
 			gettext_noop("Collects timing statistics for database I/O activity."),
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 79d787cd26..16ea25f433 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -5686,6 +5686,27 @@
   proparallel => 'r', prorettype => 'float8', proargtypes => 'oid',
   prosrc => 'pg_stat_get_function_self_time' },
 
+{ oid => '9700', descr => 'statistics: number of TOAST externalizations',
+  proname => 'pg_stat_get_toast_externalizations', provolatile => 's',
+  proparallel => 'r', prorettype => 'int8', proargtypes => 'oid int4',
+  prosrc => 'pg_stat_get_toast_externalizations' },
+{ oid => '9701', descr => 'statistics: number of TOAST compressions',
+  proname => 'pg_stat_get_toast_compressions', provolatile => 's',
+  proparallel => 'r', prorettype => 'int8', proargtypes => 'oid int4',
+  prosrc => 'pg_stat_get_toast_compressions' },
+  { oid => '9702', descr => 'statistics: number of successful TOAST compressions',
+  proname => 'pg_stat_get_toast_compressionsuccesses', provolatile => 's',
+  proparallel => 'r', prorettype => 'int8', proargtypes => 'oid int4',
+  prosrc => 'pg_stat_get_toast_compressionsuccesses' },
+{ oid => '9703', descr => 'statistics: total original size of compressed TOAST data',
+  proname => 'pg_stat_get_toast_originalsizesum', provolatile => 's',
+  proparallel => 'r', prorettype => 'int8', proargtypes => 'oid int4',
+  prosrc => 'pg_stat_get_toast_originalsizesum' },
+{ oid => '9704', descr => 'statistics: total compressed size of compressed TOAST data',
+  proname => 'pg_stat_get_toast_compressedsizesum', provolatile => 's',
+  proparallel => 'r', prorettype => 'int8', proargtypes => 'oid int4',
+  prosrc => 'pg_stat_get_toast_compressedsizesum' },
+
 { oid => '3037',
   descr => 'statistics: number of scans done for table/index in current transaction',
   proname => 'pg_stat_get_xact_numscans', provolatile => 'v',
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index 5b51b58e5a..81b410e612 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -82,10 +82,12 @@ typedef enum StatMsgType
 	PGSTAT_MTYPE_DEADLOCK,
 	PGSTAT_MTYPE_CHECKSUMFAILURE,
 	PGSTAT_MTYPE_REPLSLOT,
+	PGSTAT_MTYPE_CONNECTION,
 	PGSTAT_MTYPE_CONNECT,
 	PGSTAT_MTYPE_DISCONNECT,
 	PGSTAT_MTYPE_SUBSCRIPTIONPURGE,
 	PGSTAT_MTYPE_SUBWORKERERROR,
+	PGSTAT_MTYPE_TOASTSTAT,
 } StatMsgType;
 
 /* ----------
@@ -733,6 +735,80 @@ typedef struct PgStat_MsgDisconnect
 	SessionEndType m_cause;
 } PgStat_MsgDisconnect;
 
+/* ----------
+ * PgStat_BackendAttrIdentifier	Identifier for a single attribute/column (OID + attr)
+ * Used as a hashable identifier for (e.g.) TOAST columns
+ * ----------
+ */
+typedef struct PgStat_BackendAttrIdentifier
+{
+	Oid			relid;
+	int			attr;
+} PgStat_BackendAttrIdentifier;
+
+/* ----------
+ * PgStat_ToastCounts	The actual per-TOAST counts kept by a backend
+ *
+ * This struct should contain only actual event counters, because we memcmp
+ * it against zeroes to detect whether there are any counts to transmit.
+ *
+ * Note that the time counters are in instr_time format here.  We convert to
+ * microseconds in PgStat_Counter format when transmitting to the collector.
+ * ----------
+ */
+typedef struct PgStat_ToastCounts
+{
+	PgStat_Counter t_numexternalized;
+	PgStat_Counter t_numcompressed;
+	PgStat_Counter t_numcompressionsuccess;
+	uint64		   t_size_orig;
+	uint64		   t_size_compressed;
+	instr_time     t_comp_time;
+} PgStat_ToastCounts;
+
+/* ----------
+ * PgStat_BackendToastEntry	Entry in backend's per-toast-attr hash table
+ * ----------
+ */
+typedef struct PgStat_BackendToastEntry
+{
+	PgStat_BackendAttrIdentifier	attr;
+	PgStat_ToastCounts 				t_counts;
+} PgStat_BackendToastEntry;
+
+/* ----------
+ * PgStat_ToastEntry			Per-TOAST-column info in a MsgToaststat
+ * ----------
+ */
+typedef struct PgStat_ToastEntry
+{
+	PgStat_BackendAttrIdentifier	attr;
+	PgStat_Counter 					t_numexternalized;
+	PgStat_Counter 					t_numcompressed;
+	PgStat_Counter 					t_numcompressionsuccess;
+	uint64		   					t_size_orig;
+	uint64		   					t_size_compressed;
+	PgStat_Counter					t_comp_time;	/* time in microseconds */
+} PgStat_ToastEntry;
+
+/* ----------
+ * PgStat_MsgToaststat			Sent by the backend to report TOAST
+ *								activity statistics.
+ * ----------
+ */
+#define PGSTAT_NUM_TOASTENTRIES	\
+	((PGSTAT_MSG_PAYLOAD - sizeof(Oid) - sizeof(int))  \
+	 / sizeof(PgStat_ToastEntry))
+
+typedef struct PgStat_MsgToaststat
+{
+	PgStat_MsgHdr m_hdr;
+	Oid			m_databaseid;
+	int			m_nentries;
+	PgStat_ToastEntry m_entry[PGSTAT_NUM_TOASTENTRIES];
+} PgStat_MsgToaststat;
+
+
 /* ----------
  * PgStat_Msg					Union over all possible messages.
  * ----------
@@ -760,6 +836,7 @@ typedef union PgStat_Msg
 	PgStat_MsgSLRU msg_slru;
 	PgStat_MsgFuncstat msg_funcstat;
 	PgStat_MsgFuncpurge msg_funcpurge;
+	PgStat_MsgToaststat msg_toaststat;
 	PgStat_MsgRecoveryConflict msg_recoveryconflict;
 	PgStat_MsgDeadlock msg_deadlock;
 	PgStat_MsgTempFile msg_tempfile;
@@ -833,6 +910,7 @@ typedef struct PgStat_StatDBEntry
 	HTAB	   *tables;
 	HTAB	   *functions;
 	HTAB	   *subworkers;
+	HTAB	   *toastactivity;
 } PgStat_StatDBEntry;
 
 
@@ -1022,6 +1100,23 @@ typedef struct PgStat_StatSubWorkerEntry
 	char		last_error_message[PGSTAT_SUBWORKERERROR_MSGLEN];
 } PgStat_StatSubWorkerEntry;
 
+/* ----------
+ * PgStat_StatToastEntry			The collector's data per TOAST attribute
+ * ----------
+ */
+typedef struct PgStat_StatToastEntry
+{
+	PgStat_BackendAttrIdentifier t_id;
+	PgStat_Counter t_numexternalized;
+	PgStat_Counter t_numcompressed;
+	PgStat_Counter t_numcompressionsuccess;
+	uint64		   t_size_orig;
+	uint64		   t_size_compressed;
+
+	PgStat_Counter t_comp_time;	/* time in microseconds */
+} PgStat_StatToastEntry;
+
+
 /*
  * Working state needed to accumulate per-function-call timing statistics.
  */
@@ -1045,6 +1140,7 @@ typedef struct PgStat_FunctionCallUsage
  */
 extern PGDLLIMPORT bool pgstat_track_counts;
 extern PGDLLIMPORT int pgstat_track_functions;
+extern PGDLLIMPORT bool pgstat_track_toast;
 extern char *pgstat_stat_directory;
 extern char *pgstat_stat_tmpname;
 extern char *pgstat_stat_filename;
@@ -1196,12 +1292,22 @@ extern void pgstat_count_heap_delete(Relation rel);
 extern void pgstat_count_truncate(Relation rel);
 extern void pgstat_update_heap_dead_tuples(Relation rel, int delta);
 
+extern void pgstat_count_toast_insert(Relation rel, PgStat_Counter n);
+
 struct FunctionCallInfoBaseData;
 extern void pgstat_init_function_usage(struct FunctionCallInfoBaseData *fcinfo,
 									   PgStat_FunctionCallUsage *fcu);
 extern void pgstat_end_function_usage(PgStat_FunctionCallUsage *fcu,
 									  bool finalize);
 
+extern void
+pgstat_report_toast_activity(Oid relid, int attr,
+							bool externalized,
+							bool compressed,
+							int32 old_size,
+							int32 new_size,
+							int32 time_spent);
+
 extern void AtEOXact_PgStat(bool isCommit, bool parallel);
 extern void AtEOSubXact_PgStat(bool isCommit, int nestDepth);
 
@@ -1228,9 +1334,11 @@ extern PgStat_StatTabEntry *pgstat_fetch_stat_tabentry(Oid relid);
 extern PgStat_StatFuncEntry *pgstat_fetch_stat_funcentry(Oid funcid);
 extern PgStat_StatSubWorkerEntry *pgstat_fetch_stat_subworker_entry(Oid subid,
 																	Oid subrelid);
+extern PgStat_StatToastEntry *pgstat_fetch_stat_toastentry(Oid rel_id, int attr);
 extern PgStat_ArchiverStats *pgstat_fetch_stat_archiver(void);
 extern PgStat_BgWriterStats *pgstat_fetch_stat_bgwriter(void);
 extern PgStat_CheckpointerStats *pgstat_fetch_stat_checkpointer(void);
+
 extern PgStat_GlobalStats *pgstat_fetch_global(void);
 extern PgStat_WalStats *pgstat_fetch_stat_wal(void);
 extern PgStat_SLRUStats *pgstat_fetch_slru(void);
-- 
2.32.0
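
For anyone who wants to sanity-check the struct sizes and the per-message entry count without building the tree, here is a minimal standalone sketch. It is not part of the patch. It assumes a 4-byte Oid, an 8-byte PgStat_Counter, and a 992-byte message payload (a 1000-byte maximum message minus an 8-byte header). The names ToastEntry, ToastCounts, SKETCH_MSG_PAYLOAD, toast_entries_per_message(), messages_needed() and toast_counts_are_zero() are hypothetical stand-ins that only mirror the layout in the diff above. The backend-side time field is simplified to a plain counter here, whereas the patch uses instr_time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-ins for the PostgreSQL typedefs (assumptions, see text above). */
typedef uint32_t Oid;
typedef int64_t PgStat_Counter;

/* Assumed payload: 1000-byte max message minus an 8-byte message header. */
#define SKETCH_MSG_PAYLOAD 992

/* Mirrors PgStat_BackendAttrIdentifier: relation OID + attribute number. */
typedef struct AttrIdentifier
{
	Oid			relid;
	int			attr;
} AttrIdentifier;

/* Mirrors the layout of PgStat_ToastEntry (one entry inside a message). */
typedef struct ToastEntry
{
	AttrIdentifier attr;
	PgStat_Counter t_numexternalized;
	PgStat_Counter t_numcompressed;
	PgStat_Counter t_numcompressionsuccess;
	uint64_t	t_size_orig;
	uint64_t	t_size_compressed;
	PgStat_Counter t_comp_time;	/* microseconds */
} ToastEntry;

/* Simplified counts struct: only 8-byte members, hence no padding bytes. */
typedef struct ToastCounts
{
	PgStat_Counter t_numexternalized;
	PgStat_Counter t_numcompressed;
	PgStat_Counter t_numcompressionsuccess;
	uint64_t	t_size_orig;
	uint64_t	t_size_compressed;
	PgStat_Counter t_comp_time;	/* simplified; the patch uses instr_time */
} ToastCounts;

/* Same arithmetic as PGSTAT_NUM_TOASTENTRIES, with the assumed payload. */
size_t
toast_entries_per_message(void)
{
	return (SKETCH_MSG_PAYLOAD - sizeof(Oid) - sizeof(int)) / sizeof(ToastEntry);
}

/* Messages needed to report nattrs attributes (ceiling division). */
size_t
messages_needed(size_t nattrs)
{
	size_t		per_msg = toast_entries_per_message();

	return (nattrs + per_msg - 1) / per_msg;
}

/* The "memcmp against zeroes" test: returns 1 if there is nothing to send. */
int
toast_counts_are_zero(const ToastCounts *counts)
{
	static const ToastCounts all_zeroes = {0};

	return memcmp(counts, &all_zeroes, sizeof(ToastCounts)) == 0;
}
```

Under these assumptions an entry is 56 bytes and 17 of them fit into one message, so reporting 100 TOASTable attributes would take 6 messages. The memcmp check is only safe because the simplified struct has no padding bytes; with instr_time or mixed-width members that would have to be re-checked per platform.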
