Hi,

On Fri, May 23, 2025 at 6:12 AM Masahiko Sawada <sawada.m...@gmail.com> wrote:
>
> On Thu, May 22, 2025 at 12:44 AM Daniil Davydov <3daniss...@gmail.com> wrote:
> >
> > On Wed, May 21, 2025 at 5:30 AM Masahiko Sawada <sawada.m...@gmail.com> 
> > wrote:
> > >
> > > I find that the name "autovacuum_reserved_workers_num" is generic. It
> > > would be better to have a more specific name for parallel vacuum such
> > > as autovacuum_max_parallel_workers. This parameter is related to
> > > neither autovacuum_worker_slots nor autovacuum_max_workers, which
> > > seems fine to me. Also, max_parallel_maintenance_workers doesn't
> > > affect this parameter.
> >
> > This was my headache when I created names for variables. Autovacuum
> > initially implies parallelism, because we have several parallel a/v
> > workers.
>
> I'm not sure if it's parallelism. We can have multiple autovacuum
> workers simultaneously working on different tables, which seems not
> parallelism to me.

Hm, I hadn't thought about the definition of 'parallelism' in this way.
But I see your point - the next v4 patch will contain the naming that
you suggest.

>
> > So I think that parameter like
> > `autovacuum_max_parallel_workers` will confuse somebody.
> > If we want to have a more specific name, I would prefer
> > `max_parallel_index_autovacuum_workers`.
>
> It's better not to use 'index' as we're trying to extend parallel
> vacuum to heap scanning/vacuuming as well[1].

OK, I'll fix it.

> > > +   /*
> > > +    * If we are running autovacuum - decide whether we need to process 
> > > indexes
> > > +    * of table with given oid in parallel.
> > > +    */
> > > +   if (AmAutoVacuumWorkerProcess() &&
> > > +       params->index_cleanup != VACOPTVALUE_DISABLED &&
> > > +       RelationAllowsParallelIdxAutovac(rel))
> > >
> > > I think that this should be done in autovacuum code.
> >
> > We need params->index cleanup variable to decide whether we need to
> > use parallel index a/v. In autovacuum.c we have this code :
> > ***
> > /*
> >  * index_cleanup and truncate are unspecified at first in autovacuum.
> >  * They will be filled in with usable values using their reloptions
> >  * (or reloption defaults) later.
> >  */
> > tab->at_params.index_cleanup = VACOPTVALUE_UNSPECIFIED;
> > tab->at_params.truncate = VACOPTVALUE_UNSPECIFIED;
> > ***
> > This variable is filled in inside the `vacuum_rel` function, so I
> > think we should keep the above logic in vacuum.c.
>
> I guess that we can specify the parallel degree even if index_cleanup
> is still UNSPECIFIED. vacuum_rel() would then decide whether to use
> index vacuuming and vacuumlazy.c would decide whether to use parallel
> vacuum based on the specified parallel degree and index_cleanup value.
>
> >
> > > +#define AV_PARALLEL_DEADTUP_THRESHOLD  1024
> > >
> > > These fixed values really useful in common cases? I think we already
> > > have an optimization where we skip vacuum indexes if the table has
> > > fewer dead tuples (see BYPASS_THRESHOLD_PAGES).
> >
> > When we allocate dead items (and optionally init parallel autovacuum) we
> > don't have a sane value for `vacrel->lpdead_item_pages` (which should be
> > compared with BYPASS_THRESHOLD_PAGES).
> > The only criterion that we can focus on is the number of dead tuples
> > indicated in the PgStat_StatTabEntry.
>
> My point is that this criterion might not be useful. We have the
> bypass optimization for index vacuuming and having many dead tuples
> doesn't necessarily mean index vacuuming taking a long time. For
> example, even if the table has a few dead tuples, index vacuuming
> could take a very long time and parallel index vacuuming would help
> the situation, if the table is very large and has many indexes.

That sounds reasonable. I'll fix it.

> > But autovacuum (as I think) should work as stable as possible and
> > `unnoticed` by other processes. Thus, we must :
> > 1) Compute resources (such as the number of parallel workers for a
> > single table's indexes vacuuming) as efficiently as possible.
> > 2) Provide a guarantee that as many tables as possible (among
> > requested) will be processed in parallel.
> >
> > (1) can be achieved by calculating the parameters on the fly.
> > NUM_INDEXES_PER_PARALLEL_WORKER is a rough mock. I can provide more
> > accurate value in the near future.
>
> I think it requires more things than the number of indexes on the
> table to achieve (1). Suppose that there is a very large table that
> gets updates heavily and has a few indexes. If users want to avoid the
> table from being bloated, it would be a reasonable idea to use
> parallel vacuum during autovacuum and it would not be a good idea to
> disallow using parallel vacuum solely because it doesn't have more
> than 30 indexes. On the other hand, if the table had got many updates
> but not so now, users might want to use resources for autovacuums on
> other tables. We might need to consider autovacuum frequencies per
> table, the statistics of the previous autovacuum, or system loads etc.
> So I think that in order to achieve (1) we might need more statistics
> and using only NUM_INDEXES_PER_PARALLEL_WORKER would not work fine.
>

It's hard for me to imagine exactly how extended statistics would help
us track such situations.
It seems that for any of our heuristics, it will be possible to come
up with a counterexample.
Maybe we can give advice (via logs) to the user? But for such an
idea, tests should be conducted so that we can understand when
resource consumption becomes ineffective.
I guess that we need to agree on an implementation before conducting such tests.

> > (2) can be achieved by workers reserving - we know that N workers
> > (from bgworkers pool) are *always* at our disposal. And when we use
> > such workers we are not dependent on other operations in the cluster
> > and we don't interfere with other operations by taking resources away
> > from them.
>
> Reserving some bgworkers for autovacuum could make sense. But I think
> it's better to implement it in a general way as it could be useful in
> other use cases too. That is, it might be a good to implement
> infrastructure so that any PostgreSQL code (possibly including
> extensions) can request allocating a pool of bgworkers for specific
> usage and use bgworkers from them.

Reserving infrastructure is an ambitious idea. I am not sure that we
should implement it within this thread and feature.
Maybe we should create a separate thread for it and, as justification,
refer to parallel autovacuum?

-----
Thanks everybody for the feedback! I attach a v4 patch to this letter.
Main features:
1) 'parallel_autovacuum_workers' reloption - an integer value that sets
the maximum number of parallel a/v workers that can be taken from the
bgworkers pool in order to process this table.
2) 'max_parallel_autovacuum_workers' - a GUC variable that sets the
maximum total number of parallel a/v workers that can be taken from
the bgworkers pool.
3) Parallel autovacuum does not try to use thresholds like
NUM_INDEXES_PER_PARALLEL_WORKER and AV_PARALLEL_DEADTUP_THRESHOLD.
4) Parallel autovacuum can now report statistics like "planned vs. launched".
5) For now I got rid of the 'reserving' idea, so autovacuum leaders
are competing with everyone else for parallel workers from the
bgworkers pool.
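To make the intended usage concrete (the table name and values below are
just examples, following the reloption/GUC semantics from the patch):

```sql
-- postgresql.conf (change requires restart):
--   max_parallel_autovacuum_workers = 4

-- Per-table cap via the new reloption:
ALTER TABLE my_table SET (parallel_autovacuum_workers = 2);

-- With 0, the parallel degree is computed from the number of indexes:
ALTER TABLE my_table SET (parallel_autovacuum_workers = 0);
```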

What do you think about this implementation?

--
Best regards,
Daniil Davydov
From afa3f4c3d8993b775837cd04e5d170012b9d2691 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <d.davy...@postgrespro.ru>
Date: Fri, 16 May 2025 11:58:40 +0700
Subject: [PATCH v4 1/2] Parallel index autovacuum with bgworkers

---
 src/backend/access/common/reloptions.c        | 12 +++
 src/backend/access/heap/vacuumlazy.c          |  6 +-
 src/backend/access/transam/parallel.c         | 11 +++
 src/backend/commands/vacuumparallel.c         | 76 +++++++++++++------
 src/backend/postmaster/autovacuum.c           | 76 ++++++++++++++++++-
 src/backend/utils/init/globals.c              |  1 +
 src/backend/utils/misc/guc_tables.c           | 10 +++
 src/backend/utils/misc/postgresql.conf.sample |  2 +
 src/include/miscadmin.h                       |  1 +
 src/include/postmaster/autovacuum.h           |  4 +
 src/include/utils/guc_hooks.h                 |  2 +
 src/include/utils/rel.h                       | 12 +++
 12 files changed, 186 insertions(+), 27 deletions(-)

diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index 46c1dce222d..6ba8da62546 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -222,6 +222,16 @@ static relopt_int intRelOpts[] =
 		},
 		SPGIST_DEFAULT_FILLFACTOR, SPGIST_MIN_FILLFACTOR, 100
 	},
+	{
+		{
+			"parallel_autovacuum_workers",
+			"Maximum number of parallel autovacuum workers that can be taken from the bgworkers pool for processing this table. "
+			"If the value is 0, the parallel degree is computed based on the number of indexes.",
+			RELOPT_KIND_HEAP,
+			ShareUpdateExclusiveLock
+		},
+		-1, -1, 1024
+	},
 	{
 		{
 			"autovacuum_vacuum_threshold",
@@ -1863,6 +1873,8 @@ default_reloptions(Datum reloptions, bool validate, relopt_kind kind)
 		{"fillfactor", RELOPT_TYPE_INT, offsetof(StdRdOptions, fillfactor)},
 		{"autovacuum_enabled", RELOPT_TYPE_BOOL,
 		offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, enabled)},
+		{"parallel_autovacuum_workers", RELOPT_TYPE_INT,
+		offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, parallel_autovacuum_workers)},
 		{"autovacuum_vacuum_threshold", RELOPT_TYPE_INT,
 		offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, vacuum_threshold)},
 		{"autovacuum_vacuum_max_threshold", RELOPT_TYPE_INT,
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index f28326bad09..2614ceba139 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -3487,6 +3487,10 @@ dead_items_alloc(LVRelState *vacrel, int nworkers)
 		autovacuum_work_mem != -1 ?
 		autovacuum_work_mem : maintenance_work_mem;
 
+	int			elevel = AmAutoVacuumWorkerProcess() ||
+		vacrel->verbose ?
+		INFO : DEBUG2;
+
 	/*
 	 * Initialize state for a parallel vacuum.  As of now, only one worker can
 	 * be used for an index, so we invoke parallelism only if there are at
@@ -3513,7 +3517,7 @@ dead_items_alloc(LVRelState *vacrel, int nworkers)
 			vacrel->pvs = parallel_vacuum_init(vacrel->rel, vacrel->indrels,
 											   vacrel->nindexes, nworkers,
 											   vac_work_mem,
-											   vacrel->verbose ? INFO : DEBUG2,
+											   elevel,
 											   vacrel->bstrategy);
 
 		/*
diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index 94db1ec3012..d3313774a4b 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -34,6 +34,7 @@
 #include "miscadmin.h"
 #include "optimizer/optimizer.h"
 #include "pgstat.h"
+#include "postmaster/autovacuum.h"
 #include "storage/ipc.h"
 #include "storage/predicate.h"
 #include "storage/spin.h"
@@ -514,6 +515,11 @@ ReinitializeParallelDSM(ParallelContext *pcxt)
 	{
 		WaitForParallelWorkersToFinish(pcxt);
 		WaitForParallelWorkersToExit(pcxt);
+
+		/* Release all launched (i.e. reserved) parallel autovacuum workers. */
+		if (AmAutoVacuumWorkerProcess())
+			ParallelAutoVacuumReleaseWorkers(pcxt->nworkers_launched);
+
 		pcxt->nworkers_launched = 0;
 		if (pcxt->known_attached_workers)
 		{
@@ -1002,6 +1008,11 @@ DestroyParallelContext(ParallelContext *pcxt)
 	 */
 	HOLD_INTERRUPTS();
 	WaitForParallelWorkersToExit(pcxt);
+
+	/* Release all launched (i.e. reserved) parallel autovacuum workers. */
+	if (AmAutoVacuumWorkerProcess())
+		ParallelAutoVacuumReleaseWorkers(pcxt->nworkers_launched);
+
 	RESUME_INTERRUPTS();
 
 	/* Free the worker array itself. */
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 2b9d548cdeb..c63830fd2a5 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -1,16 +1,16 @@
 /*-------------------------------------------------------------------------
  *
  * vacuumparallel.c
- *	  Support routines for parallel vacuum execution.
+ *	  Support routines for parallel [auto]vacuum execution.
  *
  * This file contains routines that are intended to support setting up, using,
  * and tearing down a ParallelVacuumState.
  *
- * In a parallel vacuum, we perform both index bulk deletion and index cleanup
- * with parallel worker processes.  Individual indexes are processed by one
- * vacuum process.  ParallelVacuumState contains shared information as well as
- * the memory space for storing dead items allocated in the DSA area.  We
- * launch parallel worker processes at the start of parallel index
+ * In a parallel [auto]vacuum, we perform both index bulk deletion and index
+ * cleanup with parallel worker processes.  Individual indexes are processed by
+ * one [auto]vacuum process.  ParallelVacuumState contains shared information
+ * as well as the memory space for storing dead items allocated in the DSA area.
+ * We launch parallel worker processes at the start of parallel index
  * bulk-deletion and index cleanup and once all indexes are processed, the
  * parallel worker processes exit.  Each time we process indexes in parallel,
  * the parallel context is re-initialized so that the same DSM can be used for
@@ -34,6 +34,7 @@
 #include "executor/instrument.h"
 #include "optimizer/paths.h"
 #include "pgstat.h"
+#include "postmaster/autovacuum.h"
 #include "storage/bufmgr.h"
 #include "tcop/tcopprot.h"
 #include "utils/lsyscache.h"
@@ -157,7 +158,8 @@ typedef struct PVIndStats
 } PVIndStats;
 
 /*
- * Struct for maintaining a parallel vacuum state. typedef appears in vacuum.h.
+ * Struct for maintaining a parallel [auto]vacuum state. typedef appears in
+ * vacuum.h.
  */
 struct ParallelVacuumState
 {
@@ -371,10 +373,18 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
 	shared->relid = RelationGetRelid(rel);
 	shared->elevel = elevel;
 	shared->queryid = pgstat_get_my_query_id();
-	shared->maintenance_work_mem_worker =
-		(nindexes_mwm > 0) ?
-		maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
-		maintenance_work_mem;
+
+	if (AmAutoVacuumWorkerProcess() && autovacuum_work_mem != -1)
+		shared->maintenance_work_mem_worker =
+			(nindexes_mwm > 0) ?
+			autovacuum_work_mem / Min(parallel_workers, nindexes_mwm) :
+			autovacuum_work_mem;
+	else
+		shared->maintenance_work_mem_worker =
+			(nindexes_mwm > 0) ?
+			maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
+			maintenance_work_mem;
+
 	shared->dead_items_info.max_bytes = vac_work_mem * (size_t) 1024;
 
 	/* Prepare DSA space for dead items */
@@ -541,7 +551,7 @@ parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tup
  *
  * nrequested is the number of parallel workers that user requested.  If
  * nrequested is 0, we compute the parallel degree based on nindexes, that is
- * the number of indexes that support parallel vacuum.  This function also
+ * the number of indexes that support parallel [auto]vacuum.  This function also
  * sets will_parallel_vacuum to remember indexes that participate in parallel
  * vacuum.
  */
@@ -558,7 +568,9 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
 	 * We don't allow performing parallel operation in standalone backend or
 	 * when parallelism is disabled.
 	 */
-	if (!IsUnderPostmaster || max_parallel_maintenance_workers == 0)
+	if (!IsUnderPostmaster ||
+		(max_parallel_autovacuum_workers == 0 && AmAutoVacuumWorkerProcess()) ||
+		(max_parallel_maintenance_workers == 0 && !AmAutoVacuumWorkerProcess()))
 		return 0;
 
 	/*
@@ -597,15 +609,17 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
 	parallel_workers = (nrequested > 0) ?
 		Min(nrequested, nindexes_parallel) : nindexes_parallel;
 
-	/* Cap by max_parallel_maintenance_workers */
-	parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+	/* Cap by GUC variable */
+	parallel_workers = AmAutoVacuumWorkerProcess() ?
+		Min(parallel_workers, max_parallel_autovacuum_workers) :
+		Min(parallel_workers, max_parallel_maintenance_workers);
 
 	return parallel_workers;
 }
 
 /*
  * Perform index vacuum or index cleanup with parallel workers.  This function
- * must be used by the parallel vacuum leader process.
+ * must be used by the parallel [auto]vacuum leader process.
  */
 static void
 parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
@@ -666,6 +680,10 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
 	/* Reset the parallel index processing and progress counters */
 	pg_atomic_write_u32(&(pvs->shared->idx), 0);
 
+	/* Check how many workers autovacuum can provide. */
+	if (AmAutoVacuumWorkerProcess() && nworkers > 0)
+		nworkers = ParallelAutoVacuumReserveWorkers(nworkers);
+
 	/* Setup the shared cost-based vacuum delay and launch workers */
 	if (nworkers > 0)
 	{
@@ -690,6 +708,16 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
 
 		LaunchParallelWorkers(pvs->pcxt);
 
+		if (AmAutoVacuumWorkerProcess() &&
+			pvs->pcxt->nworkers_launched < nworkers)
+		{
+			/*
+			 * Tell autovacuum that we could not launch all the previously
+			 * reserved workers.
+			 */
+			ParallelAutoVacuumReleaseWorkers(nworkers - pvs->pcxt->nworkers_launched);
+		}
+
 		if (pvs->pcxt->nworkers_launched > 0)
 		{
 			/*
@@ -706,16 +734,16 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
 
 		if (vacuum)
 			ereport(pvs->shared->elevel,
-					(errmsg(ngettext("launched %d parallel vacuum worker for index vacuuming (planned: %d)",
-									 "launched %d parallel vacuum workers for index vacuuming (planned: %d)",
+					(errmsg(ngettext("launched %d parallel %svacuum worker for index vacuuming (planned: %d)",
+									 "launched %d parallel %svacuum workers for index vacuuming (planned: %d)",
 									 pvs->pcxt->nworkers_launched),
-							pvs->pcxt->nworkers_launched, nworkers)));
+							pvs->pcxt->nworkers_launched, AmAutoVacuumWorkerProcess() ? "auto" : "", nworkers)));
 		else
 			ereport(pvs->shared->elevel,
-					(errmsg(ngettext("launched %d parallel vacuum worker for index cleanup (planned: %d)",
-									 "launched %d parallel vacuum workers for index cleanup (planned: %d)",
+					(errmsg(ngettext("launched %d parallel %svacuum worker for index cleanup (planned: %d)",
+									 "launched %d parallel %svacuum workers for index cleanup (planned: %d)",
 									 pvs->pcxt->nworkers_launched),
-							pvs->pcxt->nworkers_launched, nworkers)));
+							pvs->pcxt->nworkers_launched, AmAutoVacuumWorkerProcess() ? "auto" : "", nworkers)));
 	}
 
 	/* Vacuum the indexes that can be processed by only leader process */
@@ -982,8 +1010,8 @@ parallel_vacuum_index_is_parallel_safe(Relation indrel, int num_index_scans,
 /*
  * Perform work within a launched parallel process.
  *
- * Since parallel vacuum workers perform only index vacuum or index cleanup,
- * we don't need to report progress information.
+ * Since parallel [auto]vacuum workers perform only index vacuum or index
+ * cleanup, we don't need to report progress information.
  */
 void
 parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 981be42e3af..7f34e202589 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -285,6 +285,7 @@ typedef struct AutoVacuumWorkItem
  * av_workItems		work item array
  * av_nworkersForBalance the number of autovacuum workers to use when
  * 					calculating the per worker cost limit
+ * av_active_parallel_workers the number of active parallel autovacuum workers
  *
  * This struct is protected by AutovacuumLock, except for av_signal and parts
  * of the worker list (see above).
@@ -299,6 +300,7 @@ typedef struct
 	WorkerInfo	av_startingWorker;
 	AutoVacuumWorkItem av_workItems[NUM_WORKITEMS];
 	pg_atomic_uint32 av_nworkersForBalance;
+	uint32 av_active_parallel_workers;
 } AutoVacuumShmemStruct;
 
 static AutoVacuumShmemStruct *AutoVacuumShmem;
@@ -2840,8 +2842,12 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 		 */
 		tab->at_params.index_cleanup = VACOPTVALUE_UNSPECIFIED;
 		tab->at_params.truncate = VACOPTVALUE_UNSPECIFIED;
-		/* As of now, we don't support parallel vacuum for autovacuum */
-		tab->at_params.nworkers = -1;
+
+		/* Decide whether we need to process the table's indexes in parallel. */
+		tab->at_params.nworkers = avopts
+			? avopts->parallel_autovacuum_workers
+			: -1;
+
 		tab->at_params.freeze_min_age = freeze_min_age;
 		tab->at_params.freeze_table_age = freeze_table_age;
 		tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
@@ -3322,6 +3328,61 @@ AutoVacuumRequestWork(AutoVacuumWorkItemType type, Oid relationId,
 	return result;
 }
 
+/*
+ * In order to meet the 'max_parallel_autovacuum_workers' limit, the leader
+ * worker must call this function. It returns the number of parallel workers
+ * that can actually be launched and reserves these workers (if any) in the
+ * global autovacuum state.
+ *
+ * NOTE: We will try to provide as many workers as requested, even if the
+ * caller ends up occupying all available workers.
+ */
+int
+ParallelAutoVacuumReserveWorkers(int nworkers)
+{
+	int can_launch;
+
+	/* Only leader worker can call this function. */
+	Assert(AmAutoVacuumWorkerProcess() && !IsParallelWorker());
+
+	LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+	if (AutoVacuumShmem->av_active_parallel_workers < nworkers)
+	{
+		/* Provide as many workers as we can. */
+		can_launch = AutoVacuumShmem->av_active_parallel_workers;
+		AutoVacuumShmem->av_active_parallel_workers = 0;
+	}
+	else
+	{
+		/* OK, we can provide all requested workers. */
+		can_launch = nworkers;
+		AutoVacuumShmem->av_active_parallel_workers -= nworkers;
+	}
+	LWLockRelease(AutovacuumLock);
+
+	return can_launch;
+}
+
+/*
+ * When parallel autovacuum workers die, the leader worker must call this
+ * function in order to refresh the global autovacuum state, so that other
+ * leaders will be able to use these workers.
+ *
+ * 'nworkers' - how many workers the caller wants to release.
+ */
+void
+ParallelAutoVacuumReleaseWorkers(int nworkers)
+{
+	/* Only leader worker can call this function. */
+	Assert(AmAutoVacuumWorkerProcess() && !IsParallelWorker());
+
+	LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+	AutoVacuumShmem->av_active_parallel_workers += nworkers;
+	Assert(AutoVacuumShmem->av_active_parallel_workers <=
+		   max_parallel_autovacuum_workers);
+	LWLockRelease(AutovacuumLock);
+}
+
 /*
  * autovac_init
  *		This is called at postmaster initialization.
@@ -3382,6 +3443,8 @@ AutoVacuumShmemInit(void)
 		Assert(!found);
 
 		AutoVacuumShmem->av_launcherpid = 0;
+		AutoVacuumShmem->av_active_parallel_workers =
+			max_parallel_autovacuum_workers;
 		dclist_init(&AutoVacuumShmem->av_freeWorkers);
 		dlist_init(&AutoVacuumShmem->av_runningWorkers);
 		AutoVacuumShmem->av_startingWorker = NULL;
@@ -3432,6 +3495,15 @@ check_autovacuum_work_mem(int *newval, void **extra, GucSource source)
 	return true;
 }
 
+bool
+check_max_parallel_autovacuum_workers(int *newval, void **extra,
+									  GucSource source)
+{
+	if (*newval >= max_worker_processes)
+		return false;
+	return true;
+}
+
 /*
  * Returns whether there is a free autovacuum worker slot available.
  */
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index d31cb45a058..40a92ceecd5 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -143,6 +143,7 @@ int			NBuffers = 16384;
 int			MaxConnections = 100;
 int			max_worker_processes = 8;
 int			max_parallel_workers = 8;
+int         max_parallel_autovacuum_workers = 0;
 int			MaxBackends = 0;
 
 /* GUC parameters for vacuum */
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 2f8cbd86759..950b4300100 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3604,6 +3604,16 @@ struct config_int ConfigureNamesInt[] =
 		NULL, NULL, NULL
 	},
 
+	{
+		{"max_parallel_autovacuum_workers", PGC_POSTMASTER, RESOURCES_WORKER_PROCESSES,
+			gettext_noop("Maximum number of parallel autovacuum workers that can be taken from the bgworkers pool."),
+			gettext_noop("This parameter is capped by \"max_worker_processes\" (not by \"autovacuum_max_workers\")."),
+		},
+		&max_parallel_autovacuum_workers,
+		0, 0, MAX_BACKENDS,
+		check_max_parallel_autovacuum_workers, NULL, NULL
+	},
+
 	{
 		{"max_parallel_maintenance_workers", PGC_USERSET, RESOURCES_WORKER_PROCESSES,
 			gettext_noop("Sets the maximum number of parallel processes per maintenance operation."),
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 63f991c4f93..23f5c890f78 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -221,6 +221,8 @@
 #max_parallel_maintenance_workers = 2	# limited by max_parallel_workers
 #max_parallel_workers = 8		# number of max_worker_processes that
 					# can be used in parallel operations
+#max_parallel_autovacuum_workers = 0	# disabled by default and limited by max_worker_processes
+					# (change requires restart)
 #parallel_leader_participation = on
 
 
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 1bef98471c3..7c3575b6849 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -177,6 +177,7 @@ extern PGDLLIMPORT int MaxBackends;
 extern PGDLLIMPORT int MaxConnections;
 extern PGDLLIMPORT int max_worker_processes;
 extern PGDLLIMPORT int max_parallel_workers;
+extern PGDLLIMPORT int max_parallel_autovacuum_workers;
 
 extern PGDLLIMPORT int commit_timestamp_buffers;
 extern PGDLLIMPORT int multixact_member_buffers;
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index e8135f41a1c..b5763e6ac36 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -64,6 +64,10 @@ pg_noreturn extern void AutoVacWorkerMain(const void *startup_data, size_t start
 extern bool AutoVacuumRequestWork(AutoVacuumWorkItemType type,
 								  Oid relationId, BlockNumber blkno);
 
+/* parallel autovacuum stuff */
+extern int ParallelAutoVacuumReserveWorkers(int nworkers);
+extern void ParallelAutoVacuumReleaseWorkers(int nworkers);
+
 /* shared memory stuff */
 extern Size AutoVacuumShmemSize(void);
 extern void AutoVacuumShmemInit(void);
diff --git a/src/include/utils/guc_hooks.h b/src/include/utils/guc_hooks.h
index 799fa7ace68..d4e6170d45c 100644
--- a/src/include/utils/guc_hooks.h
+++ b/src/include/utils/guc_hooks.h
@@ -31,6 +31,8 @@ extern void assign_application_name(const char *newval, void *extra);
 extern const char *show_archive_command(void);
 extern bool check_autovacuum_work_mem(int *newval, void **extra,
 									  GucSource source);
+extern bool check_max_parallel_autovacuum_workers(int *newval, void **extra,
+												  GucSource source);
 extern bool check_vacuum_buffer_usage_limit(int *newval, void **extra,
 											GucSource source);
 extern bool check_backtrace_functions(char **newval, void **extra,
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index b552359915f..16091e6a773 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -311,6 +311,8 @@ typedef struct ForeignKeyCacheInfo
 typedef struct AutoVacOpts
 {
 	bool		enabled;
+	int			parallel_autovacuum_workers; /* max number of parallel
+												autovacuum workers */
 	int			vacuum_threshold;
 	int			vacuum_max_threshold;
 	int			vacuum_ins_threshold;
@@ -409,6 +411,16 @@ typedef struct StdRdOptions
 	((relation)->rd_options ? \
 	 ((StdRdOptions *) (relation)->rd_options)->parallel_workers : (defaultpw))
 
+/*
+ * RelationGetParallelAutovacuumWorkers
+ *		Returns the relation's parallel_autovacuum_workers reloption setting.
+ *		Note multiple eval of argument!
+ */
+#define RelationGetParallelAutovacuumWorkers(relation, defaultpw) \
+	((relation)->rd_options ? \
+	 ((StdRdOptions *) (relation)->rd_options)->autovacuum.parallel_autovacuum_workers : \
+	 (defaultpw))
+
 /* ViewOptions->check_option values */
 typedef enum ViewOptCheckOption
 {
-- 
2.43.0

From 4a027ce082b0b0964fc2f2f1e7c341adff14f43b Mon Sep 17 00:00:00 2001
From: Daniil Davidov <d.davy...@postgrespro.ru>
Date: Fri, 16 May 2025 11:59:03 +0700
Subject: [PATCH v4 2/2] Sandbox for parallel index autovacuum

---
 src/test/modules/Makefile                     |   1 +
 src/test/modules/autovacuum/.gitignore        |   1 +
 src/test/modules/autovacuum/Makefile          |  14 ++
 src/test/modules/autovacuum/meson.build       |  12 ++
 .../autovacuum/t/001_autovac_parallel.pl      | 131 ++++++++++++++++++
 src/test/modules/meson.build                  |   1 +
 6 files changed, 160 insertions(+)
 create mode 100644 src/test/modules/autovacuum/.gitignore
 create mode 100644 src/test/modules/autovacuum/Makefile
 create mode 100644 src/test/modules/autovacuum/meson.build
 create mode 100644 src/test/modules/autovacuum/t/001_autovac_parallel.pl

diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile
index aa1d27bbed3..b7f3e342e82 100644
--- a/src/test/modules/Makefile
+++ b/src/test/modules/Makefile
@@ -5,6 +5,7 @@ top_builddir = ../../..
 include $(top_builddir)/src/Makefile.global
 
 SUBDIRS = \
+		  autovacuum \
 		  brin \
 		  commit_ts \
 		  delay_execution \
diff --git a/src/test/modules/autovacuum/.gitignore b/src/test/modules/autovacuum/.gitignore
new file mode 100644
index 00000000000..0b54641bceb
--- /dev/null
+++ b/src/test/modules/autovacuum/.gitignore
@@ -0,0 +1 @@
+/tmp_check/
\ No newline at end of file
diff --git a/src/test/modules/autovacuum/Makefile b/src/test/modules/autovacuum/Makefile
new file mode 100644
index 00000000000..90c00ff350b
--- /dev/null
+++ b/src/test/modules/autovacuum/Makefile
@@ -0,0 +1,14 @@
+# src/test/modules/autovacuum/Makefile
+
+TAP_TESTS = 1
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/autovacuum
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
\ No newline at end of file
diff --git a/src/test/modules/autovacuum/meson.build b/src/test/modules/autovacuum/meson.build
new file mode 100644
index 00000000000..f91c1a14d2b
--- /dev/null
+++ b/src/test/modules/autovacuum/meson.build
@@ -0,0 +1,12 @@
+# Copyright (c) 2022-2025, PostgreSQL Global Development Group
+
+tests += {
+  'name': 'autovacuum',
+  'sd': meson.current_source_dir(),
+  'bd': meson.current_build_dir(),
+  'tap': {
+    'tests': [
+      't/001_autovac_parallel.pl',
+    ],
+  },
+}
diff --git a/src/test/modules/autovacuum/t/001_autovac_parallel.pl b/src/test/modules/autovacuum/t/001_autovac_parallel.pl
new file mode 100644
index 00000000000..b4022f23948
--- /dev/null
+++ b/src/test/modules/autovacuum/t/001_autovac_parallel.pl
@@ -0,0 +1,131 @@
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+my $psql_out;
+
+my $node = PostgreSQL::Test::Cluster->new('node1');
+$node->init;
+$node->append_conf('postgresql.conf', qq{
+	autovacuum = off
+	max_wal_size = 4096
+	max_worker_processes = 20
+	max_parallel_workers = 20
+	max_parallel_maintenance_workers = 20
+	max_parallel_autovacuum_workers = 10
+	log_min_messages = info
+});
+$node->start;
+
+my $indexes_num = 80;
+my $initial_rows_num = 100_000;
+my $parallel_autovacuum_workers = 5;
+
+# Create big table and create specified number of b-tree indexes on it
+$node->safe_psql('postgres', qq{
+	CREATE TABLE test_autovac (
+		id SERIAL PRIMARY KEY,
+		col_1 INTEGER,  col_2 INTEGER,  col_3 INTEGER,  col_4 INTEGER,  col_5 INTEGER,
+		col_6 INTEGER,  col_7 INTEGER,  col_8 INTEGER,  col_9 INTEGER,  col_10 INTEGER,
+		col_11 INTEGER, col_12 INTEGER, col_13 INTEGER, col_14 INTEGER, col_15 INTEGER,
+		col_16 INTEGER, col_17 INTEGER, col_18 INTEGER, col_19 INTEGER, col_20 INTEGER,
+		col_21 INTEGER, col_22 INTEGER, col_23 INTEGER, col_24 INTEGER, col_25 INTEGER,
+		col_26 INTEGER, col_27 INTEGER, col_28 INTEGER, col_29 INTEGER, col_30 INTEGER,
+		col_31 INTEGER, col_32 INTEGER, col_33 INTEGER, col_34 INTEGER, col_35 INTEGER,
+		col_36 INTEGER, col_37 INTEGER, col_38 INTEGER, col_39 INTEGER, col_40 INTEGER,
+		col_41 INTEGER, col_42 INTEGER, col_43 INTEGER, col_44 INTEGER, col_45 INTEGER,
+		col_46 INTEGER, col_47 INTEGER, col_48 INTEGER, col_49 INTEGER, col_50 INTEGER,
+		col_51 INTEGER, col_52 INTEGER, col_53 INTEGER, col_54 INTEGER, col_55 INTEGER,
+		col_56 INTEGER, col_57 INTEGER, col_58 INTEGER, col_59 INTEGER, col_60 INTEGER,
+		col_61 INTEGER, col_62 INTEGER, col_63 INTEGER, col_64 INTEGER, col_65 INTEGER,
+		col_66 INTEGER, col_67 INTEGER, col_68 INTEGER, col_69 INTEGER, col_70 INTEGER,
+		col_71 INTEGER, col_72 INTEGER, col_73 INTEGER, col_74 INTEGER, col_75 INTEGER,
+		col_76 INTEGER, col_77 INTEGER, col_78 INTEGER, col_79 INTEGER, col_80 INTEGER,
+		col_81 INTEGER, col_82 INTEGER, col_83 INTEGER, col_84 INTEGER, col_85 INTEGER,
+		col_86 INTEGER, col_87 INTEGER, col_88 INTEGER, col_89 INTEGER, col_90 INTEGER,
+		col_91 INTEGER, col_92 INTEGER, col_93 INTEGER, col_94 INTEGER, col_95 INTEGER,
+		col_96 INTEGER, col_97 INTEGER, col_98 INTEGER, col_99 INTEGER, col_100 INTEGER
+	) WITH (parallel_autovacuum_workers = $parallel_autovacuum_workers);
+
+	DO \$\$
+	DECLARE
+		i INTEGER;
+	BEGIN
+		FOR i IN 1..$indexes_num LOOP
+			EXECUTE format('CREATE INDEX idx_col_\%s ON test_autovac (col_\%s);', i, i);
+		END LOOP;
+	END \$\$;
+});
+
+$node->psql('postgres',
+	"SELECT COUNT(*) FROM pg_index i
+	   JOIN pg_class c ON c.oid = i.indrelid
+	  WHERE c.relname = 'test_autovac';",
+	stdout => \$psql_out
+);
+is($psql_out, $indexes_num + 1, "All indexes created successfully");
+
+$node->safe_psql('postgres', qq{
+	DO \$\$
+	DECLARE
+	    i INTEGER;
+	BEGIN
+	    FOR i IN 1..$initial_rows_num LOOP
+	        INSERT INTO test_autovac (
+	            col_1, col_2, col_3, col_4, col_5, col_6, col_7, col_8, col_9, col_10,
+	            col_11, col_12, col_13, col_14, col_15, col_16, col_17, col_18, col_19, col_20,
+	            col_21, col_22, col_23, col_24, col_25, col_26, col_27, col_28, col_29, col_30,
+	            col_31, col_32, col_33, col_34, col_35, col_36, col_37, col_38, col_39, col_40,
+	            col_41, col_42, col_43, col_44, col_45, col_46, col_47, col_48, col_49, col_50,
+	            col_51, col_52, col_53, col_54, col_55, col_56, col_57, col_58, col_59, col_60,
+	            col_61, col_62, col_63, col_64, col_65, col_66, col_67, col_68, col_69, col_70,
+	            col_71, col_72, col_73, col_74, col_75, col_76, col_77, col_78, col_79, col_80,
+	            col_81, col_82, col_83, col_84, col_85, col_86, col_87, col_88, col_89, col_90,
+	            col_91, col_92, col_93, col_94, col_95, col_96, col_97, col_98, col_99, col_100
+	        ) VALUES (
+	            i, i + 1, i + 2, i + 3, i + 4, i + 5, i + 6, i + 7, i + 8, i + 9,
+	            i + 10, i + 11, i + 12, i + 13, i + 14, i + 15, i + 16, i + 17, i + 18, i + 19,
+	            i + 20, i + 21, i + 22, i + 23, i + 24, i + 25, i + 26, i + 27, i + 28, i + 29,
+	            i + 30, i + 31, i + 32, i + 33, i + 34, i + 35, i + 36, i + 37, i + 38, i + 39,
+	            i + 40, i + 41, i + 42, i + 43, i + 44, i + 45, i + 46, i + 47, i + 48, i + 49,
+	            i + 50, i + 51, i + 52, i + 53, i + 54, i + 55, i + 56, i + 57, i + 58, i + 59,
+	            i + 60, i + 61, i + 62, i + 63, i + 64, i + 65, i + 66, i + 67, i + 68, i + 69,
+	            i + 70, i + 71, i + 72, i + 73, i + 74, i + 75, i + 76, i + 77, i + 78, i + 79,
+	            i + 80, i + 81, i + 82, i + 83, i + 84, i + 85, i + 86, i + 87, i + 88, i + 89,
+	            i + 90, i + 91, i + 92, i + 93, i + 94, i + 95, i + 96, i + 97, i + 98, i + 99
+	        );
+	    END LOOP;
+	END \$\$;
+});
+
+$node->psql('postgres',
+	"SELECT COUNT(*) FROM test_autovac;",
+	stdout => \$psql_out
+);
+is($psql_out, $initial_rows_num, "All data inserted into table successfully");
+
+$node->safe_psql('postgres', qq{
+	UPDATE test_autovac SET col_1 = 0 WHERE (col_1 % 3) = 0;
+	ANALYZE test_autovac;
+});
+
+# Enable autovacuum with aggressive thresholds, so the leader process will
+# perform the parallel index vacuum phase
+$node->append_conf('postgresql.conf', qq{
+	autovacuum_naptime = '1s'
+	autovacuum_vacuum_threshold = 1
+	autovacuum_analyze_threshold = 1
+	autovacuum_vacuum_scale_factor = 0.1
+	autovacuum_analyze_scale_factor = 0.1
+	autovacuum = on
+});
+
+$node->restart;
+
+# sleep(3600);
+
+ok(1, "There are no segfaults");
+
+$node->stop;
+done_testing();
diff --git a/src/test/modules/meson.build b/src/test/modules/meson.build
index 9de0057bd1d..7f2ad810ca0 100644
--- a/src/test/modules/meson.build
+++ b/src/test/modules/meson.build
@@ -1,5 +1,6 @@
 # Copyright (c) 2022-2025, PostgreSQL Global Development Group
 
+subdir('autovacuum')
 subdir('brin')
 subdir('commit_ts')
 subdir('delay_execution')
-- 
2.43.0
