On Tue, Mar 28, 2023 at 4:21 AM Kyotaro Horiguchi
<horikyota....@gmail.com> wrote:
>
> At Mon, 27 Mar 2023 14:12:03 -0400, Melanie Plageman 
> <melanieplage...@gmail.com> wrote in
> > So, I've attached an alternate version of the patchset which takes the
> > approach of having one commit which only enables cost-based delay GUC
> > refresh for VACUUM and another commit which enables it for autovacuum
> > and makes the changes to balancing variables.
> >
> > I still think the commit which has workers updating their own
> > wi_cost_delay in vacuum_delay_point() is a bit weird. It relies on no one
> > else emulating our bad behavior and reading from wi_cost_delay without a
> > lock and on no one else deciding to ever write to wi_cost_delay (even
> > though it is in shared memory [this is the same as master]). It is only
> > safe because our process is the only one (right now) writing to
> > wi_cost_delay, so when we read from it without a lock, we know it isn't
> > being written to. And everyone else takes a lock when reading from
> > wi_cost_delay right now. So, it seems...not great.
> >
> > This approach also introduces a function that is only around for
> > one commit until the next commit obsoletes it, which seems a bit silly.
>
> (I'm not sure what this refers to, though..) I don't think it's silly
> if a later patch removes something that the preceding patches
> introdcued, as long as that contributes to readability.  Untimately,
> they will be merged together on committing.

I was under the impression that reviewers thought config reload and
worker balance changes should be committed in separate commits.

Either way, the ephemeral function is not my primary concern. I felt
more uncomfortable with increasing how often we update a double in
shared memory which is read without acquiring a lock.

> > Basically, I think it is probably better to just have one commit
> > enabling guc refresh for VACUUM and then another which correctly
> > implements what is needed for autovacuum to do the same.
> > Attached v9 does this.
> >
> > I've provided both complete versions of both approaches (v9 and v8).
>
> I took a look at v9 and have a few comments.
>
> 0001:
>
> I don't believe it is necessary, as mentioned in the commit
> message. It apperas that we are resetting it at the appropriate times.

VacuumCostBalance must be zeroed out when we disable vacuum cost.
Previously, we did not reenable VacuumCostActive once it was disabled,
but now that we do, I think it is good practice to always zero out
VacuumCostBalance when we disable vacuum cost. I will squash this commit
into the one introducing VacuumCostInactive, though.

>
> 0002:
>
> I felt a bit uneasy on this. It seems somewhat complex (and makes the
> succeeding patches complex),

Even if we introduced a second global variable to indicate that failsafe
mode has been engaged, we would still require the additional checks
of VacuumCostInactive.

> has confusing names,

I would be happy to rename the values of the enum to make them less
confusing. Are you thinking "force" instead of "locked"?
maybe:
VACUUM_COST_FORCE_INACTIVE and
VACUUM_COST_INACTIVE
?

> and doesn't seem like self-contained.

By changing the variable from VacuumCostActive to VacuumCostInactive, I
have kept all non-vacuum code from having to distinguish between it
being inactive due to failsafe mode or due to user settings.

> I think it'd be simpler to add a global boolean (maybe
> VacuumCostActiveForceDisable or such) that forces VacuumCostActive to
> be false and set VacuumCostActive using a setter function that follows
> the boolean.

I think maintaining an additional global variable is more brittle than
including the information in a single variable.

> 0003:
>
> +        * Reload the configuration file if requested. This allows changes to
> +        * vacuum_cost_limit and vacuum_cost_delay to take effect while a 
> table is
> +        * being vacuumed or analyzed. Analyze should not reload configuration
> +        * file if it is in an outer transaction, as GUC values shouldn't be
> +        * allowed to refer to some uncommitted state (e.g. database objects
> +        * created in this transaction).
>
> I'm not sure GUC reload is or should be related to transactions. For
> instance, work_mem can be changed by a reload during a transaction
> unless it has been set in the current transaction. I don't think we
> need to deliberately suppress changes in variables caused by realods
> during transactions only for analzye. If analyze doesn't like changes
> to certain GUC variables, their values should be snapshotted before
> starting the process.

Currently, we only reload the config file in top-level statements. We
don't reload the configuration file from within a nested transaction
command. BEGIN starts a transaction but not a transaction command. So
BEGIN; ANALYZE; probably wouldn't violate this rule. But it is simpler
to just forbid reloading when it is not a top-level transaction command.
I have updated the comment to reflect this.

> 0004:
> -       double          at_vacuum_cost_delay;
> -       int                     at_vacuum_cost_limit;
> +       double          at_table_option_vac_cost_delay;
> +       int                     at_table_option_vac_cost_limit;
>
> We call that options "relopt(ion)". I don't think there's any reason
> to use different names.

I've updated the names.

>         dlist_head      av_runningWorkers;
>         WorkerInfo      av_startingWorker;
>         AutoVacuumWorkItem av_workItems[NUM_WORKITEMS];
> +       pg_atomic_uint32 av_nworkers_for_balance;
>
> The name of the new member doesn't seem to follow the surrounding
> convention. (However, I don't think the member is needed. See below.)

I've updated the name to fit the convention better.

> -               /*
> -                * Remember the prevailing values of the vacuum cost GUCs.  
> We have to
> -                * restore these at the bottom of the loop, else we'll 
> compute wrong
> -                * values in the next iteration of autovac_balance_cost().
> -                */
> -               stdVacuumCostDelay = VacuumCostDelay;
> -               stdVacuumCostLimit = VacuumCostLimit;
> +               av_table_option_cost_limit = 
> tab->at_table_option_vac_cost_limit;
> +               av_table_option_cost_delay = 
> tab->at_table_option_vac_cost_delay;
>
> I think this requires a comment.

I've added one.

>
> +               /* There is at least 1 autovac worker (this worker). */
> +               int                     nworkers_for_balance = 
> Max(pg_atomic_read_u32(
> +                                                               
> &AutoVacuumShmem->av_nworkers_for_balance), 1);
>
> I think it *must* be greater than 0. However, to begin with, I don't
> think we need that variable to be shared. I don't believe it matters
> if we count involved workers every time we calculate the delay.

We are not calculating the delay but the cost limit. The cost limit must
be balanced across all of the workers currently actively vacuuming
tables without cost-related table options.

There shouldn't be a way for this to be zero, since this worker calls
autovac_balance_cost() before it starts vacuuming the table. I wanted to
rule out any possibility of a divide by 0 issue. I have changed it to an
assert instead.

> +/*
> + * autovac_balance_cost
> + *             Recalculate the number of workers to consider, given table 
> options and
> + *             the current number of active workers.
> + *
> + * Caller must hold the AutovacuumLock in at least shared mode.
>
> The function name doesn't seem align with what it does. However, I
> mentioned above that it might be unnecessary.

This is the same name as the function had previously. However, I think
it does make sense to rename it. The cost limit must be balanced across
the workers. This function calculated how many workers the cost limit
should be balanced across. I renamed it to
autovac_recalculate_workers_for_balance()

> +AutoVacuumUpdateLimit(void)
>
> If I'm not missing anything, this function does something quite
> different from the original autovac_balance_cost().  The original
> function distributes the total cost based on the GUC variables among
> workers proportionally according to each worker's cost
> parameters. Howevwer, this function distributes the total cost
> equally.

Yes, as I mentioned in the commit message, because all the workers now
have no reason to have different cost parameters (due to reloading the
config file on almost every page), there is no reason to use ratios.
Workers vacuuming a table with no cost-related table options simply need
to divide the limit equally amongst themselves because they all will
have the same limit and delay values.

>
> +               int                     vac_cost_limit = 
> autovacuum_vac_cost_limit > 0 ?
> +               autovacuum_vac_cost_limit : VacuumCostLimit;
> ...
> +               int                     balanced_cost_limit = vac_cost_limit 
> / nworkers_for_balance;
> ...
> +               VacuumCostLimit = Max(Min(balanced_cost_limit, 
> vac_cost_limit), 1);
>         }
>
> This seems to repeatedly divide VacuumCostLimit by
> nworkers_for_balance. I'm not sure, but this function might only be
> called after a reload. If that's the case, I don't think it's safe
> coding, even if it works.

Good point about repeatedly dividing VacuumCostLimit by
nworkers_for_balance. I've added a variable to keep track of the base
cost limit and separated the functionality of updating the limit into
two parts -- one AutoVacuumUpdateLimit() which is only meant to be
called after reload and references VacuumCostLimit to set the
av_base_cost_limit and another, AutoVacuumBalanceLimit(), which only
overrides VacuumCostLimit but uses av_base_cost_limit.

I've noted in the comments that AutoVacuumBalanceLimit() should be
called to adjust to a potential change in nworkers_for_balance
(currently every time after we sleep in vacuum_delay_point()) and
AutoVacuumUpdateLimit() should only be called once after a config
reload, as it references VacuumCostLimit.

I will note that this problem also exists in master, as
autovac_balance_cost references VacuumCostLimit in order to set worker
cost limits and then AutoVacuumUpdateDelay() overrides VacuumCostLimit
with the value calculated in autovac_balance_cost() from
VacuumCostLimit.

v10 attached with mentioned updates.

- Melanie
From cfa73c9086f737ec103b3caa03175b837c5565cb Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplage...@gmail.com>
Date: Sat, 25 Mar 2023 12:05:18 -0400
Subject: [PATCH v10 1/3] Make VacuumCostActive failsafe-aware

While vacuuming a table in failsafe mode, VacuumCostActive should not be
re-enabled. This currently isn't a problem because vacuum cost
parameters are only refreshed in between vacuuming tables and failsafe
status is reset for every table. In preparation for allowing vacuum cost
parameters to be updated more frequently, make vacuum cost status more
expressive.

VacuumCostActive is now VacuumCostInactive, as it can only be active in
one way but it can be inactive in two ways. If performing a failsafe
vacuum, the vacuum cost status cannot be enabled and is effectively
"locked". If performing a non-failsafe vacuum, the vacuum cost status
may be active or inactive. To express this, VacuumCostInactive can be
one of three statuses: VACUUM_COST_INACTIVE_AND_LOCKED,
VACUUM_COST_ACTIVE_AND_LOCKED, and VACUUM_COST_ACTIVE.

VacuumCostInactive is defined as an integer because we do not want
non-vacuum code concerning itself with the distinction between the three
statuses -- only with whether or not VacuumCostInactive == 0 or not.
---
 src/backend/access/heap/vacuumlazy.c  |  2 +-
 src/backend/commands/vacuum.c         | 24 +++++++++++++++++++++---
 src/backend/commands/vacuumparallel.c |  8 ++++++--
 src/backend/storage/buffer/bufmgr.c   |  8 ++++----
 src/backend/utils/init/globals.c      |  2 +-
 src/include/commands/vacuum.h         |  8 ++++++++
 src/include/miscadmin.h               |  3 +--
 7 files changed, 42 insertions(+), 13 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 8f14cf85f3..040a4e931b 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -2637,7 +2637,7 @@ lazy_check_wraparound_failsafe(LVRelState *vacrel)
 						 "You might also need to consider other ways for VACUUM to keep up with the allocation of transaction IDs.")));
 
 		/* Stop applying cost limits from this point on */
-		VacuumCostActive = false;
+		VacuumCostInactive = VACUUM_COST_INACTIVE_AND_LOCKED;
 		VacuumCostBalance = 0;
 
 		return true;
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index c54360a6a0..eb126f2247 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -491,7 +491,6 @@ vacuum(List *relations, VacuumParams *params,
 		ListCell   *cur;
 
 		in_vacuum = true;
-		VacuumCostActive = (VacuumCostDelay > 0);
 		VacuumCostBalance = 0;
 		VacuumPageHit = 0;
 		VacuumPageMiss = 0;
@@ -507,6 +506,24 @@ vacuum(List *relations, VacuumParams *params,
 		{
 			VacuumRelation *vrel = lfirst_node(VacuumRelation, cur);
 
+			/*
+			 * failsafe_active is reset per relation, so we must be sure that
+			 * VacuumCostInactive is set to either VACUUM_COST_INACTIVE or
+			 * VACUUM_COST_INACTIVE_AND_UNLOCKED in between vacuuming
+			 * relations.
+			 */
+			VacuumCostInactive = VacuumCostDelay > 0 ? VACUUM_COST_ACTIVE :
+				VACUUM_COST_INACTIVE_AND_UNLOCKED;
+
+			/*
+			 * We should not have transitioned VacuumCostInactive from
+			 * VACUUM_COST_ACTIVE to VACUUM_COST_INACTIVE_AND_UNLOCKED above,
+			 * as that should have happened when we changed the value of
+			 * VacuumCostDelay.
+			 */
+			Assert(VacuumCostInactive == VACUUM_COST_ACTIVE ||
+				   VacuumCostBalance == 0);
+
 			if (params->options & VACOPT_VACUUM)
 			{
 				if (!vacuum_rel(vrel->oid, vrel->relation, params, false))
@@ -549,7 +566,8 @@ vacuum(List *relations, VacuumParams *params,
 	PG_FINALLY();
 	{
 		in_vacuum = false;
-		VacuumCostActive = false;
+		VacuumCostInactive = VACUUM_COST_INACTIVE_AND_UNLOCKED;
+		VacuumCostBalance = 0;
 	}
 	PG_END_TRY();
 
@@ -2215,7 +2233,7 @@ vacuum_delay_point(void)
 	/* Always check for interrupts */
 	CHECK_FOR_INTERRUPTS();
 
-	if (!VacuumCostActive || InterruptPending)
+	if (VacuumCostInactive || InterruptPending)
 		return;
 
 	/*
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index bcd40c80a1..266bf6bb4c 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -989,8 +989,12 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
 												 PARALLEL_VACUUM_KEY_DEAD_ITEMS,
 												 false);
 
-	/* Set cost-based vacuum delay */
-	VacuumCostActive = (VacuumCostDelay > 0);
+	/*
+	 * Set cost-based vacuum delay Parallel vacuum workers will not execute
+	 * failsafe VACUUM.
+	 */
+	VacuumCostInactive = VacuumCostDelay > 0 ? VACUUM_COST_ACTIVE :
+		VACUUM_COST_INACTIVE_AND_UNLOCKED;
 	VacuumCostBalance = 0;
 	VacuumPageHit = 0;
 	VacuumPageMiss = 0;
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index 95212a3941..6d3dd26fc7 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -893,7 +893,7 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
 			*hit = true;
 			VacuumPageHit++;
 
-			if (VacuumCostActive)
+			if (!VacuumCostInactive)
 				VacuumCostBalance += VacuumCostPageHit;
 
 			TRACE_POSTGRESQL_BUFFER_READ_DONE(forkNum, blockNum,
@@ -1098,7 +1098,7 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
 	}
 
 	VacuumPageMiss++;
-	if (VacuumCostActive)
+	if (!VacuumCostInactive)
 		VacuumCostBalance += VacuumCostPageMiss;
 
 	TRACE_POSTGRESQL_BUFFER_READ_DONE(forkNum, blockNum,
@@ -1672,7 +1672,7 @@ MarkBufferDirty(Buffer buffer)
 	{
 		VacuumPageDirty++;
 		pgBufferUsage.shared_blks_dirtied++;
-		if (VacuumCostActive)
+		if (!VacuumCostInactive)
 			VacuumCostBalance += VacuumCostPageDirty;
 	}
 }
@@ -4199,7 +4199,7 @@ MarkBufferDirtyHint(Buffer buffer, bool buffer_std)
 		{
 			VacuumPageDirty++;
 			pgBufferUsage.shared_blks_dirtied++;
-			if (VacuumCostActive)
+			if (!VacuumCostInactive)
 				VacuumCostBalance += VacuumCostPageDirty;
 		}
 	}
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index 1b1d814254..608ebb9182 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -150,4 +150,4 @@ int64		VacuumPageMiss = 0;
 int64		VacuumPageDirty = 0;
 
 int			VacuumCostBalance = 0;	/* working state for vacuum */
-bool		VacuumCostActive = false;
+int			VacuumCostInactive = 1;
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index bdfd96cfec..5c3e250b06 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -302,6 +302,14 @@ extern PGDLLIMPORT int vacuum_failsafe_age;
 extern PGDLLIMPORT int vacuum_multixact_failsafe_age;
 
 /* Variables for cost-based parallel vacuum */
+
+typedef enum VacuumCostStatus
+{
+	VACUUM_COST_INACTIVE_AND_LOCKED = -1,
+	VACUUM_COST_ACTIVE = 0,
+	VACUUM_COST_INACTIVE_AND_UNLOCKED = 1,
+}			VacuumCostStatus;
+
 extern PGDLLIMPORT pg_atomic_uint32 *VacuumSharedCostBalance;
 extern PGDLLIMPORT pg_atomic_uint32 *VacuumActiveNWorkers;
 extern PGDLLIMPORT int VacuumCostBalanceLocal;
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 06a86f9ac1..33e22733ae 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -274,8 +274,7 @@ extern PGDLLIMPORT int64 VacuumPageMiss;
 extern PGDLLIMPORT int64 VacuumPageDirty;
 
 extern PGDLLIMPORT int VacuumCostBalance;
-extern PGDLLIMPORT bool VacuumCostActive;
-
+extern PGDLLIMPORT int VacuumCostInactive;
 
 /* in tcop/postgres.c */
 
-- 
2.37.2

From b441bc56a827e0dd5c5078fdf83db79bd9c938ec Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplage...@gmail.com>
Date: Sat, 25 Mar 2023 14:14:55 -0400
Subject: [PATCH v10 3/3] Autovacuum refreshes cost-based delay params more
 often

The previous commit allowed VACUUM to reload the config file more often
so that cost-based delay parameters could take effect while VACUUMing a
relation. Autovacuum, however did not benefit from this change.

In order for autovacuum workers to safely update their own cost delay
and cost limit parameters without impacting performance, we had to
rethink when and how these values were accessed.

Previously, an autovacuum worker's wi_cost_limit was set only at the
beginning of vacuuming a table, after reloading the config file.
Therefore, at the time that autovac_balance_cost() is called, workers
vacuuming tables with no table options could still have different values
for their wi_cost_limit_base and wi_cost_delay.

Now that the cost parameters can be updated while vacuuming a table,
workers will (within some margin of error) have no reason to have
different values for cost limit and cost delay (in the absence of table
options). This removes the rationale for keeping cost limit and cost
delay in shared memory. Balancing the cost limit requires only the
number of active autovacuum workers vacuuming a table with no cost-based
table options.
---
 src/backend/commands/vacuum.c       |  22 ++-
 src/backend/postmaster/autovacuum.c | 271 +++++++++++++++-------------
 src/include/postmaster/autovacuum.h |   2 +
 3 files changed, 165 insertions(+), 130 deletions(-)

diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 7e3a8e404e..e9b683805a 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -2241,10 +2241,10 @@ vacuum_delay_point(void)
 
 	/*
 	 * Reload the configuration file if requested. This allows changes to
-	 * vacuum_cost_limit and vacuum_cost_delay to take effect while a table is
-	 * being vacuumed or analyzed. Analyze should not reload configuration file
-	 * if it is in an outer transaction, as we currently only allow
-	 * configuration reload when in top-level statements.
+	 * [autovacuum_]vacuum_cost_limit and [autovacuum_]vacuum_cost_delay to
+	 * take effect while a table is being vacuumed or analyzed. Analyze should
+	 * not reload configuration file if it is in an outer transaction, as we
+	 * currently only allow configuration reload when in top-level statements.
 	 */
 	if (ConfigReloadPending && !analyze_in_outer_xact)
 	{
@@ -2257,10 +2257,12 @@ vacuum_delay_point(void)
 		 * by reload.
 		 */
 		AutoVacuumUpdateDelay();
+		AutoVacuumUpdateLimit();
 
 		/*
 		 * If configuration changes are allowed to impact VacuumCostInactive,
-		 * make sure it is updated.
+		 * make sure it is updated. Autovacuum workers will have already done
+		 * this in AutoVacuumUpdateDelay()
 		 */
 		if (VacuumCostInactive == VACUUM_COST_INACTIVE_AND_LOCKED)
 			return;
@@ -2311,8 +2313,14 @@ vacuum_delay_point(void)
 
 		VacuumCostBalance = 0;
 
-		/* update balance values for workers */
-		AutoVacuumUpdateDelay();
+		/*
+		 * Balance and update limit values for autovacuum workers. We must
+		 * always do this in case the autovacuum launcher or another
+		 * autovacuum worker has recalculated the number of workers across
+		 * which we must balance the limit. This is done by the launcher when
+		 * launching a new worker and by workers before vacuuming each table.
+		 */
+		AutoVacuumBalanceLimit();
 
 		/* Might have gotten an interrupt while sleeping */
 		CHECK_FOR_INTERRUPTS();
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 585d28148c..c8dae5465a 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -139,6 +139,10 @@ int			Log_autovacuum_min_duration = 600000;
 static bool am_autovacuum_launcher = false;
 static bool am_autovacuum_worker = false;
 
+static double av_relopt_cost_delay = -1;
+static int	av_relopt_cost_limit = 0;
+static int	av_base_cost_limit = 0;
+
 /* Flags set by signal handlers */
 static volatile sig_atomic_t got_SIGUSR2 = false;
 
@@ -189,8 +193,8 @@ typedef struct autovac_table
 {
 	Oid			at_relid;
 	VacuumParams at_params;
-	double		at_vacuum_cost_delay;
-	int			at_vacuum_cost_limit;
+	double		at_relopt_vac_cost_delay;
+	int			at_relopt_vac_cost_limit;
 	bool		at_dobalance;
 	bool		at_sharedrel;
 	char	   *at_relname;
@@ -209,7 +213,7 @@ typedef struct autovac_table
  * wi_sharedrel flag indicating whether table is marked relisshared
  * wi_proc		pointer to PGPROC of the running worker, NULL if not started
  * wi_launchtime Time at which this worker was launched
- * wi_cost_*	Vacuum cost-based delay parameters current in this worker
+ * wi_dobalance Whether this worker should be included in balance calculations
  *
  * All fields are protected by AutovacuumLock, except for wi_tableoid and
  * wi_sharedrel which are protected by AutovacuumScheduleLock (note these
@@ -225,9 +229,6 @@ typedef struct WorkerInfoData
 	TimestampTz wi_launchtime;
 	bool		wi_dobalance;
 	bool		wi_sharedrel;
-	double		wi_cost_delay;
-	int			wi_cost_limit;
-	int			wi_cost_limit_base;
 } WorkerInfoData;
 
 typedef struct WorkerInfoData *WorkerInfo;
@@ -273,6 +274,8 @@ typedef struct AutoVacuumWorkItem
  * av_startingWorker pointer to WorkerInfo currently being started (cleared by
  *					the worker itself as soon as it's up and running)
  * av_workItems		work item array
+ * av_nworkersForBalance the number of autovacuum workers to use when
+ * 					calculating the per worker cost limit
  *
  * This struct is protected by AutovacuumLock, except for av_signal and parts
  * of the worker list (see above).
@@ -286,6 +289,7 @@ typedef struct
 	dlist_head	av_runningWorkers;
 	WorkerInfo	av_startingWorker;
 	AutoVacuumWorkItem av_workItems[NUM_WORKITEMS];
+	pg_atomic_uint32 av_nworkersForBalance;
 } AutoVacuumShmemStruct;
 
 static AutoVacuumShmemStruct *AutoVacuumShmem;
@@ -319,7 +323,7 @@ static void launch_worker(TimestampTz now);
 static List *get_database_list(void);
 static void rebuild_database_list(Oid newdb);
 static int	db_comparator(const void *a, const void *b);
-static void autovac_balance_cost(void);
+static void autovac_recalculate_workers_for_balance(void);
 
 static void do_autovacuum(void);
 static void FreeWorkerInfo(int code, Datum arg);
@@ -670,7 +674,7 @@ AutoVacLauncherMain(int argc, char *argv[])
 			{
 				LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
 				AutoVacuumShmem->av_signal[AutoVacRebalance] = false;
-				autovac_balance_cost();
+				autovac_recalculate_workers_for_balance();
 				LWLockRelease(AutovacuumLock);
 			}
 
@@ -820,8 +824,8 @@ HandleAutoVacLauncherInterrupts(void)
 			AutoVacLauncherShutdown();
 
 		/* rebalance in case the default cost parameters changed */
-		LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
-		autovac_balance_cost();
+		LWLockAcquire(AutovacuumLock, LW_SHARED);
+		autovac_recalculate_workers_for_balance();
 		LWLockRelease(AutovacuumLock);
 
 		/* rebuild the list in case the naptime changed */
@@ -1756,9 +1760,6 @@ FreeWorkerInfo(int code, Datum arg)
 		MyWorkerInfo->wi_proc = NULL;
 		MyWorkerInfo->wi_launchtime = 0;
 		MyWorkerInfo->wi_dobalance = false;
-		MyWorkerInfo->wi_cost_delay = 0;
-		MyWorkerInfo->wi_cost_limit = 0;
-		MyWorkerInfo->wi_cost_limit_base = 0;
 		dlist_push_head(&AutoVacuumShmem->av_freeWorkers,
 						&MyWorkerInfo->wi_links);
 		/* not mine anymore */
@@ -1773,100 +1774,142 @@ FreeWorkerInfo(int code, Datum arg)
 	}
 }
 
+
 /*
- * Update the cost-based delay parameters, so that multiple workers consume
- * each a fraction of the total available I/O.
+ * Update VacuumCostDelay with the correct value for an autovacuum worker,
+ * given the value of other relevant cost-based delay parameters. Autovacuum
+ * workers should call this after every config reload, in case VacuumCostDelay
+ * was overwritten.
  */
 void
 AutoVacuumUpdateDelay(void)
 {
-	if (MyWorkerInfo)
+	if (!am_autovacuum_worker)
+		return;
+
+	if (av_relopt_cost_delay >= 0)
+		VacuumCostDelay = av_relopt_cost_delay;
+	else if (autovacuum_vac_cost_delay >= 0)
+		VacuumCostDelay = autovacuum_vac_cost_delay;
+
+	/*
+	 * If configuration changes are allowed to impact VacuumCostInactive, make
+	 * sure it is updated.
+	 */
+	if (VacuumCostInactive == VACUUM_COST_INACTIVE_AND_LOCKED)
+		return;
+
+	if (VacuumCostDelay > 0)
+		VacuumCostInactive = VACUUM_COST_ACTIVE;
+	else
 	{
-		VacuumCostDelay = MyWorkerInfo->wi_cost_delay;
-		VacuumCostLimit = MyWorkerInfo->wi_cost_limit;
+		VacuumCostInactive = VACUUM_COST_INACTIVE_AND_UNLOCKED;
+		VacuumCostBalance = 0;
 	}
 }
 
+
 /*
- * autovac_balance_cost
- *		Recalculate the cost limit setting for each active worker.
- *
- * Caller must hold the AutovacuumLock in exclusive mode.
+ * This must be called directly after a config reload before using the value of
+ * VacuumCostLimit and before calling AutoVacuumBalanceLimit(), as it uses the
+ * value of VacuumCostLimit to determine what the base av_base_cost_limit
+ * should be. AutoVacuumBalanceLimit() will override the value of
+ * VacuumCostLimit, so calling it multiple times after a config reload is
+ * incorrect.
  */
-static void
-autovac_balance_cost(void)
+void
+AutoVacuumUpdateLimit(void)
 {
+	if (!am_autovacuum_worker)
+		return;
+
 	/*
-	 * The idea here is that we ration out I/O equally.  The amount of I/O
-	 * that a worker can consume is determined by cost_limit/cost_delay, so we
-	 * try to equalize those ratios rather than the raw limit settings.
-	 *
 	 * note: in cost_limit, zero also means use value from elsewhere, because
 	 * zero is not a valid value.
 	 */
-	int			vac_cost_limit = (autovacuum_vac_cost_limit > 0 ?
-								  autovacuum_vac_cost_limit : VacuumCostLimit);
-	double		vac_cost_delay = (autovacuum_vac_cost_delay >= 0 ?
-								  autovacuum_vac_cost_delay : VacuumCostDelay);
-	double		cost_total;
-	double		cost_avail;
-	dlist_iter	iter;
+	if (av_relopt_cost_limit > 0)
+		VacuumCostLimit = av_relopt_cost_limit;
+	else
+	{
+		av_base_cost_limit = autovacuum_vac_cost_limit > 0 ?
+			autovacuum_vac_cost_limit : VacuumCostLimit;
+
+		AutoVacuumBalanceLimit();
+	}
+}
+
+/*
+ * Update VacuumCostLimit with the correct value for an autovacuum worker,
+ * given the value of other relevant cost limit parameters and the number of
+ * workers across which the limit must be balanced. Autovacuum workers must
+ * call this regularly in case av_nworkers_for_balance has been updated by
+ * another worker or by the autovacuum launcher. After a config reload, they
+ * must call AutoVacuumUpdateLimit() which will call AutoVacuumBalanceLimit(),
+ * in case VacuumCostLimit was overwritten.
+ */
+void
+AutoVacuumBalanceLimit(void)
+{
+	int			nworkers_for_balance;
+	int			total_cost_limit;
+	int			balanced_cost_limit;
 
-	/* not set? nothing to do */
-	if (vac_cost_limit <= 0 || vac_cost_delay <= 0)
+	if (!am_autovacuum_worker)
 		return;
 
-	/* calculate the total base cost limit of participating active workers */
-	cost_total = 0.0;
-	dlist_foreach(iter, &AutoVacuumShmem->av_runningWorkers)
-	{
-		WorkerInfo	worker = dlist_container(WorkerInfoData, wi_links, iter.cur);
+	Assert(av_base_cost_limit > 0);
 
-		if (worker->wi_proc != NULL &&
-			worker->wi_dobalance &&
-			worker->wi_cost_limit_base > 0 && worker->wi_cost_delay > 0)
-			cost_total +=
-				(double) worker->wi_cost_limit_base / worker->wi_cost_delay;
-	}
+	nworkers_for_balance = pg_atomic_read_u32(
+							&AutoVacuumShmem->av_nworkersForBalance);
+
+	/* There is at least 1 autovac worker (this worker). */
+	Assert(nworkers_for_balance > 0);
+
+	total_cost_limit = autovacuum_vac_cost_limit > 0 ?
+		autovacuum_vac_cost_limit : av_base_cost_limit;
 
-	/* there are no cost limits -- nothing to do */
-	if (cost_total <= 0)
+	balanced_cost_limit = total_cost_limit / nworkers_for_balance;
+
+	VacuumCostLimit = Max(Min(balanced_cost_limit, total_cost_limit), 1);
+}
+
+/*
+ * autovac_recalculate_workers_for_balance
+ *		Recalculate the number of workers to consider, given table options and
+ *		the current number of active workers.
+ *
+ * Caller must hold the AutovacuumLock in at least shared mode.
+ */
+static void
+autovac_recalculate_workers_for_balance(void)
+{
+	dlist_iter	iter;
+	int			orig_nworkers_for_balance;
+	int			nworkers_for_balance = 0;
+
+	if (autovacuum_vac_cost_delay == 0 ||
+		(autovacuum_vac_cost_delay == -1 && VacuumCostDelay == 0))
 		return;
 
-	/*
-	 * Adjust cost limit of each active worker to balance the total of cost
-	 * limit to autovacuum_vacuum_cost_limit.
-	 */
-	cost_avail = (double) vac_cost_limit / vac_cost_delay;
+	if (autovacuum_vac_cost_limit <= 0 && VacuumCostLimit <= 0)
+		return;
+
+	orig_nworkers_for_balance =
+		pg_atomic_read_u32(&AutoVacuumShmem->av_nworkersForBalance);
+
 	dlist_foreach(iter, &AutoVacuumShmem->av_runningWorkers)
 	{
 		WorkerInfo	worker = dlist_container(WorkerInfoData, wi_links, iter.cur);
 
-		if (worker->wi_proc != NULL &&
-			worker->wi_dobalance &&
-			worker->wi_cost_limit_base > 0 && worker->wi_cost_delay > 0)
-		{
-			int			limit = (int)
-			(cost_avail * worker->wi_cost_limit_base / cost_total);
-
-			/*
-			 * We put a lower bound of 1 on the cost_limit, to avoid division-
-			 * by-zero in the vacuum code.  Also, in case of roundoff trouble
-			 * in these calculations, let's be sure we don't ever set
-			 * cost_limit to more than the base value.
-			 */
-			worker->wi_cost_limit = Max(Min(limit,
-											worker->wi_cost_limit_base),
-										1);
-		}
+		if (worker->wi_proc == NULL || !worker->wi_dobalance)
+			continue;
 
-		if (worker->wi_proc != NULL)
-			elog(DEBUG2, "autovac_balance_cost(pid=%d db=%u, rel=%u, dobalance=%s cost_limit=%d, cost_limit_base=%d, cost_delay=%g)",
-				 worker->wi_proc->pid, worker->wi_dboid, worker->wi_tableoid,
-				 worker->wi_dobalance ? "yes" : "no",
-				 worker->wi_cost_limit, worker->wi_cost_limit_base,
-				 worker->wi_cost_delay);
+		nworkers_for_balance++;
 	}
+
+	if (nworkers_for_balance != orig_nworkers_for_balance)
+		pg_atomic_write_u32(&AutoVacuumShmem->av_nworkersForBalance,
+							nworkers_for_balance);
 }
 
 /*
@@ -2312,8 +2355,6 @@ do_autovacuum(void)
 		autovac_table *tab;
 		bool		isshared;
 		bool		skipit;
-		double		stdVacuumCostDelay;
-		int			stdVacuumCostLimit;
 		dlist_iter	iter;
 
 		CHECK_FOR_INTERRUPTS();
@@ -2417,30 +2458,26 @@ do_autovacuum(void)
 		}
 
 		/*
-		 * Remember the prevailing values of the vacuum cost GUCs.  We have to
-		 * restore these at the bottom of the loop, else we'll compute wrong
-		 * values in the next iteration of autovac_balance_cost().
+		 * Save the cost-related table options in global variables for
+		 * reference when updating VacuumCostLimit and VacuumCostDelay during
+		 * vacuuming this table.
 		 */
-		stdVacuumCostDelay = VacuumCostDelay;
-		stdVacuumCostLimit = VacuumCostLimit;
+		av_relopt_cost_limit = tab->at_relopt_vac_cost_limit;
+		av_relopt_cost_delay = tab->at_relopt_vac_cost_delay;
 
 		/* Must hold AutovacuumLock while mucking with cost balance info */
 		LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
-
-		/* advertise my cost delay parameters for the balancing algorithm */
 		MyWorkerInfo->wi_dobalance = tab->at_dobalance;
-		MyWorkerInfo->wi_cost_delay = tab->at_vacuum_cost_delay;
-		MyWorkerInfo->wi_cost_limit = tab->at_vacuum_cost_limit;
-		MyWorkerInfo->wi_cost_limit_base = tab->at_vacuum_cost_limit;
-
-		/* do a balance */
-		autovac_balance_cost();
+		autovac_recalculate_workers_for_balance();
+		LWLockRelease(AutovacuumLock);
 
-		/* set the active cost parameters from the result of that */
+		/*
+		 * We wait until this point to update cost delay and cost limit
+		 * values, even though we reloaded the configuration file above, so
+		 * that we can take into account the cost-related table options.
+		 */
 		AutoVacuumUpdateDelay();
-
-		/* done */
-		LWLockRelease(AutovacuumLock);
+		AutoVacuumUpdateLimit();
 
 		/* clean up memory before each iteration */
 		MemoryContextResetAndDeleteChildren(PortalContext);
@@ -2525,19 +2562,15 @@ deleted:
 
 		/*
 		 * Remove my info from shared memory.  We could, but intentionally
-		 * don't, clear wi_cost_limit and friends --- this is on the
-		 * assumption that we probably have more to do with similar cost
-		 * settings, so we don't want to give up our share of I/O for a very
-		 * short interval and thereby thrash the global balance.
+		 * don't, set wi_dobalance to false on the assumption that we are more
+		 * likely than not to vacuum a table with no table options next, so we
+		 * don't want to give up our share of I/O for a very short interval
+		 * and thereby thrash the global balance.
 		 */
 		LWLockAcquire(AutovacuumScheduleLock, LW_EXCLUSIVE);
 		MyWorkerInfo->wi_tableoid = InvalidOid;
 		MyWorkerInfo->wi_sharedrel = false;
 		LWLockRelease(AutovacuumScheduleLock);
-
-		/* restore vacuum cost GUCs for the next iteration */
-		VacuumCostDelay = stdVacuumCostDelay;
-		VacuumCostLimit = stdVacuumCostLimit;
 	}
 
 	/*
@@ -2569,6 +2602,8 @@ deleted:
 		{
 			ConfigReloadPending = false;
 			ProcessConfigFile(PGC_SIGHUP);
+			AutoVacuumUpdateDelay();
+			AutoVacuumUpdateLimit();
 		}
 
 		LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
@@ -2804,8 +2839,6 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 		int			freeze_table_age;
 		int			multixact_freeze_min_age;
 		int			multixact_freeze_table_age;
-		int			vac_cost_limit;
-		double		vac_cost_delay;
 		int			log_min_duration;
 
 		/*
@@ -2815,20 +2848,6 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 		 * defaults, autovacuum's own first and plain vacuum second.
 		 */
 
-		/* -1 in autovac setting means use plain vacuum_cost_delay */
-		vac_cost_delay = (avopts && avopts->vacuum_cost_delay >= 0)
-			? avopts->vacuum_cost_delay
-			: (autovacuum_vac_cost_delay >= 0)
-			? autovacuum_vac_cost_delay
-			: VacuumCostDelay;
-
-		/* 0 or -1 in autovac setting means use plain vacuum_cost_limit */
-		vac_cost_limit = (avopts && avopts->vacuum_cost_limit > 0)
-			? avopts->vacuum_cost_limit
-			: (autovacuum_vac_cost_limit > 0)
-			? autovacuum_vac_cost_limit
-			: VacuumCostLimit;
-
 		/* -1 in autovac setting means use log_autovacuum_min_duration */
 		log_min_duration = (avopts && avopts->log_min_duration >= 0)
 			? avopts->log_min_duration
@@ -2884,8 +2903,10 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 		tab->at_params.multixact_freeze_table_age = multixact_freeze_table_age;
 		tab->at_params.is_wraparound = wraparound;
 		tab->at_params.log_min_duration = log_min_duration;
-		tab->at_vacuum_cost_limit = vac_cost_limit;
-		tab->at_vacuum_cost_delay = vac_cost_delay;
+		tab->at_relopt_vac_cost_limit = avopts ?
+			avopts->vacuum_cost_limit : 0;
+		tab->at_relopt_vac_cost_delay = avopts ?
+			avopts->vacuum_cost_delay : -1;
 		tab->at_relname = NULL;
 		tab->at_nspname = NULL;
 		tab->at_datname = NULL;
@@ -3377,10 +3398,14 @@ AutoVacuumShmemInit(void)
 		worker = (WorkerInfo) ((char *) AutoVacuumShmem +
 							   MAXALIGN(sizeof(AutoVacuumShmemStruct)));
 
+
 		/* initialize the WorkerInfo free list */
 		for (i = 0; i < autovacuum_max_workers; i++)
 			dlist_push_head(&AutoVacuumShmem->av_freeWorkers,
 							&worker[i].wi_links);
+
+		pg_atomic_init_u32(&AutoVacuumShmem->av_nworkersForBalance, 0);
+
 	}
 	else
 		Assert(found);
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index c140371b51..80bdfb2cc0 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -64,7 +64,9 @@ extern int	StartAutoVacWorker(void);
 extern void AutoVacWorkerFailed(void);
 
 /* autovacuum cost-delay balancer */
+extern void AutoVacuumBalanceLimit(void);
 extern void AutoVacuumUpdateDelay(void);
+extern void AutoVacuumUpdateLimit(void);
 
 #ifdef EXEC_BACKEND
 extern void AutoVacLauncherMain(int argc, char *argv[]) pg_attribute_noreturn();
-- 
2.37.2

From df724144b6fbdc804fb91033bd88df0f82ba6f30 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplage...@gmail.com>
Date: Mon, 27 Mar 2023 13:33:19 -0400
Subject: [PATCH v10 2/3] VACUUM reloads config file more often

Previously, VACUUM would not reload the configuration file. So, changes
to cost-based delay parameters could only take effect on the next
invocation of VACUUM.

Check if a reload is pending roughly once per block now, when checking
if we need to delay.

Note that autovacuum is unaffected by this change. Autovacuum workers
overwrite the value of VacuumCostLimit and VacuumCostDelay with their
own WorkerInfo->wi_cost_limit and wi_cost_delay. Writing to their
wi_cost_delay more often makes reading wi_cost_delay without a lock to
update VacuumCostDelay an even worse idea.

Reviewed-by: Masahiko Sawada <sawada.m...@gmail.com>
Reviewed-by: Daniel Gustafsson <dan...@yesql.se>
Discussion: https://www.postgresql.org/message-id/flat/CAAKRu_buP5wzsho3qNw5o9_R0pF69FRM5hgCmr-mvXmGXwdA7A%40mail.gmail.com#5e6771d4cdca4db6efc2acec2dce0bc7
---
 src/backend/commands/vacuum.c | 61 ++++++++++++++++++++++++++++++-----
 1 file changed, 53 insertions(+), 8 deletions(-)

diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index eb126f2247..7e3a8e404e 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -48,6 +48,7 @@
 #include "pgstat.h"
 #include "postmaster/autovacuum.h"
 #include "postmaster/bgworker_internals.h"
+#include "postmaster/interrupt.h"
 #include "storage/bufmgr.h"
 #include "storage/lmgr.h"
 #include "storage/pmsignal.h"
@@ -76,6 +77,7 @@ int			vacuum_multixact_failsafe_age;
 /* A few variables that don't seem worth passing around as parameters */
 static MemoryContext vac_context = NULL;
 static BufferAccessStrategy vac_strategy;
+static bool analyze_in_outer_xact = false;
 
 
 /*
@@ -314,8 +316,7 @@ vacuum(List *relations, VacuumParams *params,
 	static bool in_vacuum = false;
 
 	const char *stmttype;
-	volatile bool in_outer_xact,
-				use_own_xacts;
+	volatile bool use_own_xacts;
 
 	Assert(params != NULL);
 
@@ -332,10 +333,10 @@ vacuum(List *relations, VacuumParams *params,
 	if (params->options & VACOPT_VACUUM)
 	{
 		PreventInTransactionBlock(isTopLevel, stmttype);
-		in_outer_xact = false;
+		analyze_in_outer_xact = false;
 	}
 	else
-		in_outer_xact = IsInTransactionBlock(isTopLevel);
+		analyze_in_outer_xact = IsInTransactionBlock(isTopLevel);
 
 	/*
 	 * Due to static variables vac_context, anl_context and vac_strategy,
@@ -457,7 +458,7 @@ vacuum(List *relations, VacuumParams *params,
 		Assert(params->options & VACOPT_ANALYZE);
 		if (IsAutoVacuumWorkerProcess())
 			use_own_xacts = true;
-		else if (in_outer_xact)
+		else if (analyze_in_outer_xact)
 			use_own_xacts = false;
 		else if (list_length(relations) > 1)
 			use_own_xacts = true;
@@ -475,7 +476,7 @@ vacuum(List *relations, VacuumParams *params,
 	 */
 	if (use_own_xacts)
 	{
-		Assert(!in_outer_xact);
+		Assert(!analyze_in_outer_xact);
 
 		/* ActiveSnapshot is not set by autovacuum */
 		if (ActiveSnapshotSet())
@@ -544,7 +545,7 @@ vacuum(List *relations, VacuumParams *params,
 				}
 
 				analyze_rel(vrel->oid, vrel->relation, params,
-							vrel->va_cols, in_outer_xact, vac_strategy);
+							vrel->va_cols, analyze_in_outer_xact, vac_strategy);
 
 				if (use_own_xacts)
 				{
@@ -568,6 +569,7 @@ vacuum(List *relations, VacuumParams *params,
 		in_vacuum = false;
 		VacuumCostInactive = VACUUM_COST_INACTIVE_AND_UNLOCKED;
 		VacuumCostBalance = 0;
+		analyze_in_outer_xact = false;
 	}
 	PG_END_TRY();
 
@@ -2233,7 +2235,50 @@ vacuum_delay_point(void)
 	/* Always check for interrupts */
 	CHECK_FOR_INTERRUPTS();
 
-	if (VacuumCostInactive || InterruptPending)
+	if (InterruptPending ||
+		(VacuumCostInactive && !ConfigReloadPending))
+		return;
+
+	/*
+	 * Reload the configuration file if requested. This allows changes to
+	 * vacuum_cost_limit and vacuum_cost_delay to take effect while a table is
+	 * being vacuumed or analyzed. Analyze should not reload configuration file
+	 * if it is in an outer transaction, as we currently only allow
+	 * configuration reload when in top-level statements.
+	 */
+	if (ConfigReloadPending && !analyze_in_outer_xact)
+	{
+		ConfigReloadPending = false;
+		ProcessConfigFile(PGC_SIGHUP);
+
+		/*
+		 * Autovacuum workers must restore the correct values of
+		 * VacuumCostLimit and VacuumCostDelay in case they were overwritten
+		 * by reload.
+		 */
+		AutoVacuumUpdateDelay();
+
+		/*
+		 * If configuration changes are allowed to impact VacuumCostInactive,
+		 * make sure it is updated.
+		 */
+		if (VacuumCostInactive == VACUUM_COST_INACTIVE_AND_LOCKED)
+			return;
+
+		if (VacuumCostDelay > 0)
+			VacuumCostInactive = VACUUM_COST_ACTIVE;
+		else
+		{
+			VacuumCostInactive = VACUUM_COST_INACTIVE_AND_UNLOCKED;
+			VacuumCostBalance = 0;
+		}
+	}
+
+	/*
+	 * If we disabled cost-based delays after reloading the config file,
+	 * return.
+	 */
+	if (VacuumCostInactive)
 		return;
 
 	/*
-- 
2.37.2

Reply via email to