On Thu, Mar 16, 2023 at 07:04:16PM +0400, Ilya Gladyshev wrote:
> > 16 марта 2023 г., в 04:07, Justin Pryzby <pry...@telsasoft.com> написал(а):
> > 
> > On Tue, Mar 14, 2023 at 06:58:14PM +0400, Ilya Gladyshev wrote:
> >>> The only change from the current patch is (3).  (1) still calls
> >>> count_leaf_partitions(), but only once.  I'd prefer that to rearranging
> >>> the progress reporting to set the TOTAL in ProcessUtilitySlow().
> >> 
> >> As for reusing TOTAL calculated outside of DefineIndex, as I can see, 
> >> ProcessUtilitySlow is not the only call site for DefineIndex (although, I 
> >> don’t know whether all of them need progress tracking), for instance, 
> >> there is ALTER TABLE that calls DefineIndex to create index for 
> >> constraints. So I feel like rearranging progress reporting will result in 
> >> unnecessary code duplication in those call sites, so passing in an 
> >> optional parameter seems to be easier here, if we are going to optimize 
> >> it, after all. Especially if back-patching is a non-issue.
> > 
> > Yeah.  See attached.  I don't like duplicating the loop.  Is this really
> > the right direction to go ?
> > 
> > I haven't verified if the child tables are locked in all the paths which
> > would call count_leaf_partitions().  But why is it important to lock
> > them for this?  If they weren't locked before, that'd be a pre-existing
> > problem...
> > <0001-fix-CREATE-INDEX-progress-report-with-nested-partiti.patch>
> 
> I’m not sure what the general policy on locking is, but I have checked ALTER 
> TABLE ADD INDEX, and the all the partitions seem to be locked on the first 
> entry to DefineIndex there. All other call sites pass in the parentIndexId, 
> which means the progress tracking machinery will not be initialized, so I 
> think, we don’t need to do locking in count_leaf_partitions(). 

> The approach in the patch looks good to me. Some nitpicks on the patch: 
> 1. There’s an unnecessary second call to get_rel_relkind in 
> ProcessUtilitySlow, we can just use what’s in the variable relkind.
> 2. We can also combine else and if to have one less nested level like that:
> +                             else if (!RELKIND_HAS_PARTITIONS(child_relkind))
> 
> 3. There was a part of the comment saying "If the index was built by calling 
> DefineIndex() recursively, the called function is responsible for updating 
> the progress report for built indexes.", I think it is still useful to have 
> it there.

Thanks, I addressed (1) and (3).  (2) is deliberate, to allow a place to
put the comment which is not specific to the !HAS_PARTITIONS case.

-- 
Justin
>From 90300bc4ca08088de3a0015ff5c1a251537feacc Mon Sep 17 00:00:00 2001
From: Ilya Gladyshev <ilya.v.gladys...@gmail.com>
Date: Tue, 31 Jan 2023 19:13:07 +0400
Subject: [PATCH 1/3] fix CREATE INDEX progress report with nested partitions

The progress reporting was added in v12 (ab0dfc961) but the original
patch didn't seem to consider the possibility of nested partitioning.

When called recursively, DefineIndex() would clobber the number of
completed partitions, and it was possible to end up with the TOTAL
counter greater than the DONE counter.

This clarifies/re-defines that the progress reporting counts both direct
and indirect children, but doesn't count intermediate partitioned tables:

- The TOTAL counter is set once at the start of the command.
- For indexes which are newly-built, the recursively-called
DefineIndex() increments the DONE counter.
- For pre-existing indexes which are ATTACHed rather than built,
DefineIndex() increments the DONE counter, unless the attached index is
partitioned, in which case progress report is not updated.

Author: Ilya Gladyshev
Reviewed-By: Justin Pryzby, Tomas Vondra, Dean Rasheed, Alvaro Herrera, Matthias van de Meent
Discussion: https://www.postgresql.org/message-id/flat/a15f904a70924ffa4ca25c3c744cff31e0e6e143.camel%40gmail.com
---
 doc/src/sgml/monitoring.sgml                  | 10 ++-
 src/backend/bootstrap/bootparse.y             |  2 +
 src/backend/commands/indexcmds.c              | 74 +++++++++++++++++--
 src/backend/commands/tablecmds.c              |  4 +-
 src/backend/tcop/utility.c                    |  8 +-
 src/backend/utils/activity/backend_progress.c | 28 +++++++
 src/include/commands/defrem.h                 |  1 +
 src/include/utils/backend_progress.h          |  1 +
 8 files changed, 119 insertions(+), 9 deletions(-)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 6249bb50d02..3f891b75541 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -6901,7 +6901,10 @@ FROM pg_stat_get_backend_idset() AS backendid;
       </para>
       <para>
        When creating an index on a partitioned table, this column is set to
-       the total number of partitions on which the index is to be created.
+       the total number of partitions on which the index is to be created or attached.
+       In the case of intermediate partitioned tables, this includes both
+       direct and indirect partitions, but excludes the intermediate
+       partitioned tables themselves.
        This field is <literal>0</literal> during a <literal>REINDEX</literal>.
       </para></entry>
      </row>
@@ -6912,7 +6915,10 @@ FROM pg_stat_get_backend_idset() AS backendid;
       </para>
       <para>
        When creating an index on a partitioned table, this column is set to
-       the number of partitions on which the index has been created.
+       the number of partitions on which the index has been created or attached.
+       In the case of intermediate partitioned tables, this includes both
+       direct and indirect partitions, but excludes the intermediate
+       partitioned tables themselves.
        This field is <literal>0</literal> during a <literal>REINDEX</literal>.
       </para></entry>
      </row>
diff --git a/src/backend/bootstrap/bootparse.y b/src/backend/bootstrap/bootparse.y
index 86804bb598e..81a1b7bfec3 100644
--- a/src/backend/bootstrap/bootparse.y
+++ b/src/backend/bootstrap/bootparse.y
@@ -306,6 +306,7 @@ Boot_DeclareIndexStmt:
 								$4,
 								InvalidOid,
 								InvalidOid,
+								-1,
 								false,
 								false,
 								false,
@@ -358,6 +359,7 @@ Boot_DeclareUniqueIndexStmt:
 								$5,
 								InvalidOid,
 								InvalidOid,
+								-1,
 								false,
 								false,
 								false,
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index 16ec0b114e6..2a87c851746 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -130,6 +130,31 @@ typedef struct ReindexErrorInfo
 	char		relkind;
 } ReindexErrorInfo;
 
+
+/*
+ * Count the number of direct and indirect leaf partitions.  Note that this
+ * also excludes foreign tables.  This should be consistent with the loop in
+ * ProcessUtilitySlow().
+ */
+static int
+count_leaf_partitions(Oid relid)
+{
+	int			nleaves = 0;
+	List	   *childs = find_all_inheritors(relid, NoLock, NULL);
+	ListCell   *lc;
+
+	foreach(lc, childs)
+	{
+		Oid			partrelid = lfirst_oid(lc);
+
+		if (RELKIND_HAS_STORAGE(get_rel_relkind(partrelid)))
+			nleaves++;
+	}
+
+	list_free(childs);
+	return nleaves;
+}
+
 /*
  * CheckIndexCompatible
  *		Determine whether an existing index definition is compatible with a
@@ -518,6 +543,7 @@ WaitForOlderSnapshots(TransactionId limitXmin, bool progress)
  *		case the caller had better have checked it earlier.
  * 'skip_build': make the catalog entries but don't create the index files
  * 'quiet': suppress the NOTICE chatter ordinarily provided for constraints.
+ * 'total_leaves': total number of leaf partitions (excluding foreign tables)
  *
  * Returns the object address of the created index.
  */
@@ -527,6 +553,7 @@ DefineIndex(Oid relationId,
 			Oid indexRelationId,
 			Oid parentIndexId,
 			Oid parentConstraintId,
+			int total_leaves,
 			bool is_alter_table,
 			bool check_rights,
 			bool check_not_in_use,
@@ -1219,8 +1246,22 @@ DefineIndex(Oid relationId,
 			Relation	parentIndex;
 			TupleDesc	parentDesc;
 
-			pgstat_progress_update_param(PROGRESS_CREATEIDX_PARTITIONS_TOTAL,
-										 nparts);
+			/*
+			 * Set the total number of partitions at the start of the command;
+			 * don't update it when being called recursively.
+			 */
+			if (!OidIsValid(parentIndexId))
+			{
+				 /*
+				  * When called by ProcessUtilitySlow(), the number of leaves is
+				  * passed in as an optimization.
+				  */
+				 if (total_leaves < 0)
+					 total_leaves = count_leaf_partitions(relationId);
+
+				pgstat_progress_update_param(PROGRESS_CREATEIDX_PARTITIONS_TOTAL,
+											 total_leaves);
+			}
 
 			/* Make a local copy of partdesc->oids[], just for safety */
 			memcpy(part_oids, partdesc->oids, sizeof(Oid) * nparts);
@@ -1250,6 +1291,7 @@ DefineIndex(Oid relationId,
 			{
 				Oid			childRelid = part_oids[i];
 				Relation	childrel;
+				char		child_relkind;
 				Oid			child_save_userid;
 				int			child_save_sec_context;
 				int			child_save_nestlevel;
@@ -1259,6 +1301,7 @@ DefineIndex(Oid relationId,
 				bool		found = false;
 
 				childrel = table_open(childRelid, lockmode);
+				child_relkind = RelationGetForm(childrel)->relkind;
 
 				GetUserIdAndSecContext(&child_save_userid,
 									   &child_save_sec_context);
@@ -1426,14 +1469,29 @@ DefineIndex(Oid relationId,
 								InvalidOid, /* no predefined OID */
 								indexRelationId,	/* this is our child */
 								createdConstraintId,
+								-1,
 								is_alter_table, check_rights, check_not_in_use,
 								skip_build, quiet);
 					SetUserIdAndSecContext(child_save_userid,
 										   child_save_sec_context);
 				}
+				else
+				{
+					/*
+					 * If the index was built by calling DefineIndex()
+					 * recursively, the called function is responsible for
+					 * updating the progress report for built indexes.
+					 *
+					 * If a pre-existing index was attached, the progress
+					 * report is updated here.  But if the index is
+					 * partitioned, we don't count its partitions, since that's
+					 * expensive, and the ATTACH is fast anyway.
+					 */
+
+					if (!RELKIND_HAS_PARTITIONS(child_relkind))
+						pgstat_progress_incr_param(PROGRESS_CREATEIDX_PARTITIONS_DONE, 1);
+				}
 
-				pgstat_progress_update_param(PROGRESS_CREATEIDX_PARTITIONS_DONE,
-											 i + 1);
 				free_attrmap(attmap);
 			}
 
@@ -1484,9 +1542,15 @@ DefineIndex(Oid relationId,
 		/* Close the heap and we're done, in the non-concurrent case */
 		table_close(rel, NoLock);
 
-		/* If this is the top-level index, we're done. */
+		/*
+		 * If this is the top-level index, we're done. When called recursively
+		 * for child tables, the done partition counter is incremented now,
+		 * rather than in the caller.
+		 */
 		if (!OidIsValid(parentIndexId))
 			pgstat_progress_end_command();
+		else
+			pgstat_progress_incr_param(PROGRESS_CREATEIDX_PARTITIONS_DONE, 1);
 
 		return address;
 	}
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 3e2c5f797cd..df2fe468b65 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -1216,6 +1216,7 @@ DefineRelation(CreateStmt *stmt, char relkind, Oid ownerId,
 						InvalidOid,
 						RelationGetRelid(idxRel),
 						constraintOid,
+						-1,
 						false, false, false, false, false);
 
 			index_close(idxRel, AccessShareLock);
@@ -8640,6 +8641,7 @@ ATExecAddIndex(AlteredTableInfo *tab, Relation rel,
 						  InvalidOid,	/* no predefined OID */
 						  InvalidOid,	/* no parent index */
 						  InvalidOid,	/* no parent constraint */
+						  -1,			/* total_leaves unknown */
 						  true, /* is_alter_table */
 						  check_rights,
 						  false,	/* check_not_in_use - we did it already */
@@ -18105,7 +18107,7 @@ AttachPartitionEnsureIndexes(Relation rel, Relation attachrel)
 										   &conOid);
 			DefineIndex(RelationGetRelid(attachrel), stmt, InvalidOid,
 						RelationGetRelid(idxRel),
-						conOid,
+						conOid, -1,
 						true, false, false, false, false);
 		}
 
diff --git a/src/backend/tcop/utility.c b/src/backend/tcop/utility.c
index eada7353639..bbc3d7ada23 100644
--- a/src/backend/tcop/utility.c
+++ b/src/backend/tcop/utility.c
@@ -1465,6 +1465,7 @@ ProcessUtilitySlow(ParseState *pstate,
 					Oid			relid;
 					LOCKMODE	lockmode;
 					bool		is_alter_table;
+					int			total_leaves = 0;
 
 					if (stmt->concurrent)
 						PreventInTransactionBlock(isTopLevel,
@@ -1505,7 +1506,8 @@ ProcessUtilitySlow(ParseState *pstate,
 						inheritors = find_all_inheritors(relid, lockmode, NULL);
 						foreach(lc, inheritors)
 						{
-							char		relkind = get_rel_relkind(lfirst_oid(lc));
+							Oid			partrelid = lfirst_oid(lc);
+							char		relkind = get_rel_relkind(partrelid);
 
 							if (relkind != RELKIND_RELATION &&
 								relkind != RELKIND_MATVIEW &&
@@ -1522,6 +1524,9 @@ ProcessUtilitySlow(ParseState *pstate,
 												stmt->relation->relname),
 										 errdetail("Table \"%s\" contains partitions that are foreign tables.",
 												   stmt->relation->relname)));
+
+							if (RELKIND_HAS_STORAGE(relkind))
+								total_leaves++;
 						}
 						list_free(inheritors);
 					}
@@ -1548,6 +1553,7 @@ ProcessUtilitySlow(ParseState *pstate,
 									InvalidOid, /* no predefined OID */
 									InvalidOid, /* no parent index */
 									InvalidOid, /* no parent constraint */
+									total_leaves,
 									is_alter_table,
 									true,	/* check_rights */
 									true,	/* check_not_in_use */
diff --git a/src/backend/utils/activity/backend_progress.c b/src/backend/utils/activity/backend_progress.c
index d96af812b19..2a9994b98fd 100644
--- a/src/backend/utils/activity/backend_progress.c
+++ b/src/backend/utils/activity/backend_progress.c
@@ -58,6 +58,34 @@ pgstat_progress_update_param(int index, int64 val)
 	PGSTAT_END_WRITE_ACTIVITY(beentry);
 }
 
+/*-----------
+ * pgstat_progress_incr_param() -
+ *
+ * Increment index'th member in st_progress_param[] of the current backend.
+ *-----------
+ */
+void
+pgstat_progress_incr_param(int index, int64 incr)
+{
+	volatile PgBackendStatus *beentry = MyBEEntry;
+	int64		val;
+
+	Assert(index >= 0 && index < PGSTAT_NUM_PROGRESS_PARAM);
+
+	if (!beentry || !pgstat_track_activities)
+		return;
+
+	/*
+	 * Because no other process should write to this backend's own status, we
+	 * can read its value from shared memory without needing to loop to ensure
+	 * its consistency.
+	 */
+	val = beentry->st_progress_param[index];
+	val += incr;
+
+	pgstat_progress_update_param(index, val);
+}
+
 /*-----------
  * pgstat_progress_update_multi_param() -
  *
diff --git a/src/include/commands/defrem.h b/src/include/commands/defrem.h
index 4f7f87fc62c..b774b3028ce 100644
--- a/src/include/commands/defrem.h
+++ b/src/include/commands/defrem.h
@@ -29,6 +29,7 @@ extern ObjectAddress DefineIndex(Oid relationId,
 								 Oid indexRelationId,
 								 Oid parentIndexId,
 								 Oid parentConstraintId,
+								 int total_leaves,
 								 bool is_alter_table,
 								 bool check_rights,
 								 bool check_not_in_use,
diff --git a/src/include/utils/backend_progress.h b/src/include/utils/backend_progress.h
index 005e5d75ab6..a84752ade99 100644
--- a/src/include/utils/backend_progress.h
+++ b/src/include/utils/backend_progress.h
@@ -36,6 +36,7 @@ typedef enum ProgressCommandType
 extern void pgstat_progress_start_command(ProgressCommandType cmdtype,
 										  Oid relid);
 extern void pgstat_progress_update_param(int index, int64 val);
+extern void pgstat_progress_incr_param(int index, int64 incr);
 extern void pgstat_progress_update_multi_param(int nparam, const int *index,
 											   const int64 *val);
 extern void pgstat_progress_end_command(void);
-- 
2.34.1

>From a3965c2025062c5dc7080c1025e530628a6180cf Mon Sep 17 00:00:00 2001
From: Justin Pryzby <pryz...@telsasoft.com>
Date: Wed, 1 Feb 2023 10:23:53 -0600
Subject: [PATCH 2/3] assertions for progress reporting

---
 src/backend/commands/analyze.c                | 10 ++-
 src/backend/utils/activity/backend_progress.c | 84 +++++++++++++++++++
 2 files changed, 92 insertions(+), 2 deletions(-)

diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c
index 65750958bb2..3bfc941aa2c 100644
--- a/src/backend/commands/analyze.c
+++ b/src/backend/commands/analyze.c
@@ -1145,6 +1145,12 @@ acquire_sample_rows(Relation onerel, int elevel,
 	TableScanDesc scan;
 	BlockNumber nblocks;
 	BlockNumber blksdone = 0;
+	int64		progress_vals[2] = {0};
+	int const	progress_inds[2] = {
+		PROGRESS_ANALYZE_BLOCKS_DONE,
+		PROGRESS_ANALYZE_BLOCKS_TOTAL
+	};
+
 #ifdef USE_PREFETCH
 	int			prefetch_maximum = 0;	/* blocks to prefetch if enabled */
 	BlockSamplerData prefetch_bs;
@@ -1169,8 +1175,8 @@ acquire_sample_rows(Relation onerel, int elevel,
 #endif
 
 	/* Report sampling block numbers */
-	pgstat_progress_update_param(PROGRESS_ANALYZE_BLOCKS_TOTAL,
-								 nblocks);
+	progress_vals[1] = nblocks;
+	pgstat_progress_update_multi_param(2, progress_inds, progress_vals);
 
 	/* Prepare for sampling rows */
 	reservoir_init_selection_state(&rstate, targrows);
diff --git a/src/backend/utils/activity/backend_progress.c b/src/backend/utils/activity/backend_progress.c
index 2a9994b98fd..63f9482b175 100644
--- a/src/backend/utils/activity/backend_progress.c
+++ b/src/backend/utils/activity/backend_progress.c
@@ -10,6 +10,7 @@
  */
 #include "postgres.h"
 
+#include "commands/progress.h"
 #include "port/atomics.h"		/* for memory barriers */
 #include "utils/backend_progress.h"
 #include "utils/backend_status.h"
@@ -37,6 +38,85 @@ pgstat_progress_start_command(ProgressCommandType cmdtype, Oid relid)
 	PGSTAT_END_WRITE_ACTIVITY(beentry);
 }
 
+/*
+ * Check for consistency of progress data (current < total).
+ *
+ * Check during pgstat_progress_updates_*() rather than only from
+ * pgstat_progress_end_command() to catch issues with uninitialized/stale data
+ * from previous progress commands.
+ *
+ * If a command fails due to interrupt or error, the values may be less than
+ * the expected final value.
+ */
+static void
+pgstat_progress_asserts(void)
+{
+	volatile PgBackendStatus *beentry = MyBEEntry;
+	volatile int64 *a = beentry->st_progress_param;
+
+	switch (beentry->st_progress_command)
+	{
+		case PROGRESS_COMMAND_VACUUM:
+			Assert(a[PROGRESS_VACUUM_HEAP_BLKS_SCANNED] <=
+				   a[PROGRESS_VACUUM_TOTAL_HEAP_BLKS]);
+			Assert(a[PROGRESS_VACUUM_HEAP_BLKS_VACUUMED] <=
+				   a[PROGRESS_VACUUM_TOTAL_HEAP_BLKS]);
+			Assert(a[PROGRESS_VACUUM_NUM_DEAD_TUPLES] <=
+				   a[PROGRESS_VACUUM_MAX_DEAD_TUPLES]);
+			break;
+
+		case PROGRESS_COMMAND_ANALYZE:
+			Assert(a[PROGRESS_ANALYZE_BLOCKS_DONE] <=
+				   a[PROGRESS_ANALYZE_BLOCKS_TOTAL]);
+			Assert(a[PROGRESS_ANALYZE_EXT_STATS_COMPUTED] <=
+				   a[PROGRESS_ANALYZE_EXT_STATS_TOTAL]);
+			Assert(a[PROGRESS_ANALYZE_CHILD_TABLES_DONE] <=
+				   a[PROGRESS_ANALYZE_CHILD_TABLES_TOTAL]);
+			break;
+
+		case PROGRESS_COMMAND_CLUSTER:
+			Assert(a[PROGRESS_CLUSTER_HEAP_BLKS_SCANNED] <=
+				   a[PROGRESS_CLUSTER_TOTAL_HEAP_BLKS]);
+			/* FALLTHROUGH */
+			/* ..because CLUSTER rebuilds indexes */
+
+		case PROGRESS_COMMAND_CREATE_INDEX:
+			Assert(a[PROGRESS_CREATEIDX_TUPLES_DONE] <=
+				   a[PROGRESS_CREATEIDX_TUPLES_TOTAL]);
+			Assert(a[PROGRESS_CREATEIDX_PARTITIONS_DONE] <=
+				   a[PROGRESS_CREATEIDX_PARTITIONS_TOTAL]);
+			break;
+
+		case PROGRESS_COMMAND_BASEBACKUP:
+			/* progress reporting is optional for these */
+			if (a[PROGRESS_BASEBACKUP_BACKUP_TOTAL] >= 0)
+			{
+				Assert(a[PROGRESS_BASEBACKUP_BACKUP_STREAMED] <=
+					   a[PROGRESS_BASEBACKUP_BACKUP_TOTAL]);
+				Assert(a[PROGRESS_BASEBACKUP_TBLSPC_STREAMED] <=
+					   a[PROGRESS_BASEBACKUP_TBLSPC_TOTAL]);
+			}
+			break;
+
+		case PROGRESS_COMMAND_COPY:
+#if 0
+			//This currently fails file_fdw tests, since pgstat_prorgress evidently fails
+				// to support simultaneous copy commands, as happens during JOIN.
+			/* bytes progress is not available in all cases */
+				if (a[PROGRESS_COPY_BYTES_TOTAL] > 0)
+				//Assert(a[PROGRESS_COPY_BYTES_PROCESSED] <= a[PROGRESS_COPY_BYTES_TOTAL]);
+			if (a[PROGRESS_COPY_BYTES_PROCESSED] > a[PROGRESS_COPY_BYTES_TOTAL])
+				elog(WARNING, "PROGRESS_COPY_BYTES_PROCESSED %ld %ld",
+					 a[PROGRESS_COPY_BYTES_PROCESSED],
+					 a[PROGRESS_COPY_BYTES_TOTAL]);
+#endif
+			break;
+
+		case PROGRESS_COMMAND_INVALID:
+			break;				/* Do nothing */
+	}
+}
+
 /*-----------
  * pgstat_progress_update_param() -
  *
@@ -56,6 +136,8 @@ pgstat_progress_update_param(int index, int64 val)
 	PGSTAT_BEGIN_WRITE_ACTIVITY(beentry);
 	beentry->st_progress_param[index] = val;
 	PGSTAT_END_WRITE_ACTIVITY(beentry);
+
+	pgstat_progress_asserts();
 }
 
 /*-----------
@@ -113,6 +195,8 @@ pgstat_progress_update_multi_param(int nparam, const int *index,
 	}
 
 	PGSTAT_END_WRITE_ACTIVITY(beentry);
+
+	pgstat_progress_asserts();
 }
 
 /*-----------
-- 
2.34.1

>From 84388375d4808d44877fc9e26eacf9219158be2b Mon Sep 17 00:00:00 2001
From: Justin Pryzby <pryz...@telsasoft.com>
Date: Sat, 21 Jan 2023 21:41:04 -0600
Subject: [PATCH 3/3] f! also assert that progress values don't go backwards
 and the total is constant

pgstat_progress_update_multi_param() is being abused as a way to update
a value which might otherwise violate the assertion.

See also:
https://www.postgresql.org/message-id/CA%2BTgmoYSvEP3weQaCPGf6%2BDXLy2__JbJUYtoXyWP%3DqHcyGbihA%40mail.gmail.com
---
 src/backend/access/heap/vacuumlazy.c          | 37 +++++++++
 src/backend/storage/lmgr/lmgr.c               | 24 +++---
 src/backend/utils/activity/backend_progress.c | 81 +++++++++++++++++++
 3 files changed, 130 insertions(+), 12 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 8f14cf85f38..61cfbf6a17a 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -1042,6 +1042,13 @@ lazy_scan_heap(LVRelState *vacrel)
 
 				/* Forget the LP_DEAD items that we just vacuumed */
 				dead_items->num_items = 0;
+				{
+					const int	progress_inds[] = {PROGRESS_VACUUM_NUM_DEAD_TUPLES};
+					const int64 progress_vals[] = {0};
+
+					pgstat_progress_update_multi_param(1, progress_inds, progress_vals);
+				}
+
 
 				/*
 				 * Periodically perform FSM vacuuming to make newly-freed
@@ -2199,6 +2206,13 @@ lazy_vacuum(LVRelState *vacrel)
 	{
 		Assert(!vacrel->do_index_cleanup);
 		vacrel->dead_items->num_items = 0;
+		{
+			const int	progress_inds[] = {PROGRESS_VACUUM_NUM_DEAD_TUPLES};
+			const int64 progress_vals[] = {0};
+
+			pgstat_progress_update_multi_param(1, progress_inds, progress_vals);
+		}
+
 		return;
 	}
 
@@ -2301,6 +2315,13 @@ lazy_vacuum(LVRelState *vacrel)
 	 * vacuum)
 	 */
 	vacrel->dead_items->num_items = 0;
+
+	{
+		const int	progress_inds[] = {PROGRESS_VACUUM_NUM_DEAD_TUPLES};
+		const int64 progress_vals[] = {0};
+
+		pgstat_progress_update_multi_param(1, progress_inds, progress_vals);
+	}
 }
 
 /*
@@ -2414,12 +2435,23 @@ lazy_vacuum_heap_rel(LVRelState *vacrel)
 	BlockNumber vacuumed_pages = 0;
 	Buffer		vmbuffer = InvalidBuffer;
 	LVSavedErrInfo saved_err_info;
+#if 0
+	const int	progress_inds[] = {
+		PROGRESS_VACUUM_PHASE,
+		PROGRESS_VACUUM_NUM_DEAD_TUPLES,
+	};
+	const int64 progress_vals[] = {
+		PROGRESS_VACUUM_PHASE_VACUUM_HEAP,
+		0,
+	};
+#endif
 
 	Assert(vacrel->do_index_vacuuming);
 	Assert(vacrel->do_index_cleanup);
 	Assert(vacrel->num_index_scans > 0);
 
 	/* Report that we are now vacuuming the heap */
+	//pgstat_progress_update_multi_param(2, progress_inds, progress_vals);
 	pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
 								 PROGRESS_VACUUM_PHASE_VACUUM_HEAP);
 
@@ -3190,7 +3222,12 @@ dead_items_alloc(LVRelState *vacrel, int nworkers)
 	dead_items = (VacDeadItems *) palloc(vac_max_items_to_alloc_size(max_items));
 	dead_items->max_items = max_items;
 	dead_items->num_items = 0;
+	{
+		const int	progress_inds[] = {PROGRESS_VACUUM_NUM_DEAD_TUPLES};
+		const int64 progress_vals[] = {0};
 
+		pgstat_progress_update_multi_param(1, progress_inds, progress_vals);
+	}
 	vacrel->dead_items = dead_items;
 }
 
diff --git a/src/backend/storage/lmgr/lmgr.c b/src/backend/storage/lmgr/lmgr.c
index ee9b89a6726..a3acdce5ff5 100644
--- a/src/backend/storage/lmgr/lmgr.c
+++ b/src/backend/storage/lmgr/lmgr.c
@@ -912,6 +912,13 @@ WaitForLockersMultiple(List *locktags, LOCKMODE lockmode, bool progress)
 	int			total = 0;
 	int			done = 0;
 
+	const int	progress_index[] = {
+		PROGRESS_WAITFOR_TOTAL,
+		PROGRESS_WAITFOR_DONE,
+		PROGRESS_WAITFOR_CURRENT_PID
+	};
+	const int64 progress_values[] = {0, 0, 0};
+
 	/* Done if no locks to wait for */
 	if (locktags == NIL)
 		return;
@@ -930,7 +937,10 @@ WaitForLockersMultiple(List *locktags, LOCKMODE lockmode, bool progress)
 	}
 
 	if (progress)
+	{
+		pgstat_progress_update_multi_param(3, progress_index, progress_values);
 		pgstat_progress_update_param(PROGRESS_WAITFOR_TOTAL, total);
+	}
 
 	/*
 	 * Note: GetLockConflicts() never reports our own xid, hence we need not
@@ -960,19 +970,9 @@ WaitForLockersMultiple(List *locktags, LOCKMODE lockmode, bool progress)
 				pgstat_progress_update_param(PROGRESS_WAITFOR_DONE, ++done);
 		}
 	}
+
 	if (progress)
-	{
-		const int	index[] = {
-			PROGRESS_WAITFOR_TOTAL,
-			PROGRESS_WAITFOR_DONE,
-			PROGRESS_WAITFOR_CURRENT_PID
-		};
-		const int64 values[] = {
-			0, 0, 0
-		};
-
-		pgstat_progress_update_multi_param(3, index, values);
-	}
+		pgstat_progress_update_multi_param(3, progress_index, progress_values);
 
 	list_free_deep(holders);
 }
diff --git a/src/backend/utils/activity/backend_progress.c b/src/backend/utils/activity/backend_progress.c
index 63f9482b175..41ad6884521 100644
--- a/src/backend/utils/activity/backend_progress.c
+++ b/src/backend/utils/activity/backend_progress.c
@@ -117,6 +117,78 @@ pgstat_progress_asserts(void)
 	}
 }
 
+/*
+ * Check that newval >= oldval, and that when total is not being set twice.
+ */
+static void
+pgstat_progress_assert_forward_progress(int command, int index,
+										int64 oldval, int64 newval)
+{
+	switch (command)
+	{
+		case PROGRESS_COMMAND_ANALYZE:
+			/*
+			 * phase goes backwards for inheritance tables, which are sampled
+			 * twice
+			 */
+			if (index != PROGRESS_ANALYZE_CURRENT_CHILD_TABLE_RELID &&
+				index != PROGRESS_ANALYZE_PHASE)
+				Assert(newval >= oldval);
+			if (index == PROGRESS_ANALYZE_BLOCKS_TOTAL ||
+				index == PROGRESS_ANALYZE_EXT_STATS_TOTAL ||
+				index == PROGRESS_ANALYZE_CHILD_TABLES_TOTAL)
+				Assert(oldval == 0);
+			break;
+
+		case PROGRESS_COMMAND_CLUSTER:
+			if (index != PROGRESS_CLUSTER_INDEX_RELID)
+				Assert(newval >= oldval);
+			if (index == PROGRESS_CLUSTER_TOTAL_HEAP_BLKS)
+				Assert(oldval == 0);
+			break;
+
+		case PROGRESS_COMMAND_CREATE_INDEX:
+			if (index != PROGRESS_CREATEIDX_INDEX_OID &&
+				index != PROGRESS_CREATEIDX_SUBPHASE &&
+				index != PROGRESS_WAITFOR_CURRENT_PID &&
+				index != PROGRESS_CREATEIDX_ACCESS_METHOD_OID)
+				Assert(newval >= oldval);
+
+			if (index == PROGRESS_CREATEIDX_TUPLES_TOTAL ||
+				index == PROGRESS_CREATEIDX_PARTITIONS_TOTAL)
+				Assert(oldval == 0);
+			break;
+
+		case PROGRESS_COMMAND_BASEBACKUP:
+			if (index == PROGRESS_BASEBACKUP_BACKUP_TOTAL &&
+				oldval == 0 && newval == -1)
+				return;			/* Do nothing: this is the initial "null"
+								 * state before the size is estimated */
+			Assert(newval >= oldval);
+
+			if (index == PROGRESS_BASEBACKUP_BACKUP_TOTAL ||
+				index == PROGRESS_BASEBACKUP_TBLSPC_TOTAL)
+				Assert(oldval == 0);
+			break;
+
+		case PROGRESS_COMMAND_COPY:
+			Assert(newval >= oldval);
+			if (index == PROGRESS_COPY_BYTES_TOTAL)
+				Assert(oldval == 0);
+			break;
+		case PROGRESS_COMMAND_VACUUM:
+			Assert(newval >= oldval);
+			if (index == PROGRESS_VACUUM_TOTAL_HEAP_BLKS)
+				Assert(oldval == 0);
+			if (index == PROGRESS_VACUUM_MAX_DEAD_TUPLES)
+				Assert(oldval == 0);
+			break;
+
+		case PROGRESS_COMMAND_INVALID:
+			break;
+	}
+}
+
 /*-----------
  * pgstat_progress_update_param() -
  *
@@ -133,6 +205,15 @@ pgstat_progress_update_param(int index, int64 val)
 	if (!beentry || !pgstat_track_activities)
 		return;
 
+	if (index != PROGRESS_SCAN_BLOCKS_DONE)
+	{
+		/* Check that progress does not go backwards */
+		int64		oldval = beentry->st_progress_param[index];
+
+		pgstat_progress_assert_forward_progress(beentry->st_progress_command,
+												index, oldval, val);
+	}
+
 	PGSTAT_BEGIN_WRITE_ACTIVITY(beentry);
 	beentry->st_progress_param[index] = val;
 	PGSTAT_END_WRITE_ACTIVITY(beentry);
-- 
2.34.1

Reply via email to