Hi, When vacuum is running for hours on a table with many indexes, we can see what phase it is in and how many index vacuum cycles have happened (index_vacuum_count), but not how much time each phase has taken. That makes it hard to tell what is actually slow.
The attached patch adds three columns to pg_stat_progress_vacuum that report the total time spent in heap vacuuming, index vacuuming, and index cleanup. The values are accumulated across overflow cycles when the dead TID store fills up. These columns make it easy to see whether the bottleneck is heap scanning or index processing, analyze why vacuum is slow, and adjust maintenance_work_mem accordingly to reduce TID store overflow cycles. They also make it possible to estimate how long vacuum takes to finish - polling these timing columns along with scanned blocks gives a good estimated time. We could put this behind a track_vacuum_timing GUC similar to track_io_timing and track_wal_timing of the time capturing calls seem costly. I'd prefer not to add another GUC, but I'm open to thoughts on this. For example, I tested with a 3M-row table (3.9 GB), 4 btree indexes, and maintenance_work_mem=256kB (kept low to force TID-store overflows). About 34 minutes in, only 13% through the vacuum [1]. index_vacuum_time is 2006s vs heap_vacuum_time of 2.1s - index processing is where all the time goes. 13,188 overflow cycles × 4 indexes = 52,752 index scans so far. In production, tables with more indexes, these columns show exactly where vacuum time is going and whether raising maintenance_work_mem would help. With the help of the attached sql function, the estimated time to finish the vacuum is about 3.5 hours [2]. Thoughts? [1] SELECT * FROM pg_stat_progress_vacuum; pid | 18764 relid | 16384 phase | vacuuming indexes heap_blks_total | 500000 heap_blks_scanned | 66202 heap_blks_vacuumed | 66202 index_vacuum_count | 13188 max_dead_tuple_bytes | 262144 dead_tuple_bytes | 0 num_dead_item_ids | 0 indexes_total | 4 indexes_processed | 4 delay_time | 0 mode | manual started_by | user heap_vacuum_time | 2.098345 index_vacuum_time | 2005.931383 index_cleanup_time | 0 [2] SELECT * FROM estimate_vacuum_completion(); pid | 18764 relname | vac_timing_test pct_done | 13.24 time_remaining | 03:35:39 -- Bharath Rupireddy Amazon Web Services: https://aws.amazon.com
From b266993a0d61c3f46da32730086b7d538150f4fb Mon Sep 17 00:00:00 2001 From: Bharath Rupireddy <[email protected]> Date: Sun, 19 Apr 2026 00:01:13 +0000 Subject: [PATCH v1] Add heap and index vacuum timings to pg_stat_progress_vacuum When vacuum is running for hours on a table with many indexes, we can see what phase it is in and how many index vacuum cycles have happened (index_vacuum_count), but not how much time each phase has taken. That makes it hard to tell what is actually slow. This commit adds three columns to pg_stat_progress_vacuum that report the total time spent in heap vacuuming, index vacuuming, and index cleanup. The values carry over across overflow cycles when the dead TID store fills up and vacuum restarts. These columns make it easy to see whether the bottleneck is heap scanning or index processing, analyze why vacuum is slow, and adjust maintenance_work_mem accordingly to reduce TID store overflow cycles. These columns also make it possible to estimate how long vacuum takes to finish. Polling these timing columns along with scanned blocks gives a good estimated time to finish vacuum. --- doc/src/sgml/monitoring.sgml | 36 +++++++++++++++ src/backend/access/heap/vacuumlazy.c | 68 ++++++++++++++++++++++++++++ src/backend/catalog/system_views.sql | 5 +- src/include/commands/progress.h | 3 ++ src/test/regress/expected/rules.out | 5 +- 5 files changed, 115 insertions(+), 2 deletions(-) diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml index 08d5b824552..7912b9fce7f 100644 --- a/doc/src/sgml/monitoring.sgml +++ b/doc/src/sgml/monitoring.sgml @@ -7577,6 +7577,42 @@ FROM pg_stat_get_backend_idset() AS backendid; </itemizedlist> </para></entry> </row> + + <row> + <entry role="catalog_table_entry"><para role="column_definition"> + <structfield>heap_vacuum_time</structfield> <type>double precision</type> + </para> + <para> + Total time spent in heap vacuuming phase, in seconds (see + <xref linkend="vacuum-phases"/>). This includes time spent scanning + the heap to collect dead TIDs, plus the time spent in the heap + vacuuming pass (removing dead items from heap pages). The time is + accumulated across multiple passes when the TID store fills up. + </para></entry> + </row> + + <row> + <entry role="catalog_table_entry"><para role="column_definition"> + <structfield>index_vacuum_time</structfield> <type>double precision</type> + </para> + <para> + Total time spent in the index vacuuming phase, in seconds (see + <xref linkend="vacuum-phases"/>). This includes the time to vacuum + all indexes on the table in each pass, and is accumulated across + multiple passes when all indexes have to be vacuumed again. + </para></entry> + </row> + + <row> + <entry role="catalog_table_entry"><para role="column_definition"> + <structfield>index_cleanup_time</structfield> <type>double precision</type> + </para> + <para> + Total time spent in the index cleanup phase, in seconds (see + <xref linkend="vacuum-phases"/>). This includes the time to + clean up all indexes on the table. + </para></entry> + </row> </tbody> </tgroup> </table> diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c index 39395aed0d5..bdcf424e42c 100644 --- a/src/backend/access/heap/vacuumlazy.c +++ b/src/backend/access/heap/vacuumlazy.c @@ -345,6 +345,12 @@ typedef struct LVRelState int num_dead_items_resets; Size total_dead_items_bytes; + /* Per-phase elapsed time tracking (in microseconds) */ + instr_time phase_start_time; /* when the current phase started */ + int64 heap_vacuum_elapsed_us; + int64 index_vacuum_elapsed_us; + int64 index_cleanup_elapsed_us; + /* * Total number of planned and actually launched parallel workers for * index vacuuming and index cleanup. @@ -767,6 +773,9 @@ heap_vacuum_rel(Relation rel, const VacuumParams *params, vacrel->num_index_scans = 0; vacrel->num_dead_items_resets = 0; vacrel->total_dead_items_bytes = 0; + vacrel->heap_vacuum_elapsed_us = 0; + vacrel->index_vacuum_elapsed_us = 0; + vacrel->index_cleanup_elapsed_us = 0; vacrel->tuples_deleted = 0; vacrel->tuples_frozen = 0; vacrel->lpdead_items = 0; @@ -1285,6 +1294,7 @@ lazy_scan_heap(LVRelState *vacrel) BlockNumber orig_eager_scan_success_limit = vacrel->eager_scan_remaining_successes; /* for logging */ Buffer vmbuffer = InvalidBuffer; + instr_time now; const int initprog_index[] = { PROGRESS_VACUUM_PHASE, PROGRESS_VACUUM_TOTAL_HEAP_BLKS, @@ -1298,6 +1308,9 @@ lazy_scan_heap(LVRelState *vacrel) initprog_val[2] = vacrel->dead_items_info->max_bytes; pgstat_progress_update_multi_param(3, initprog_index, initprog_val); + /* Start timing the heap scan phase */ + INSTR_TIME_SET_CURRENT(vacrel->phase_start_time); + /* Initialize for the first heap_vac_scan_next_block() call */ vacrel->current_block = InvalidBlockNumber; vacrel->next_unskippable_block = InvalidBlockNumber; @@ -1367,6 +1380,15 @@ lazy_scan_heap(LVRelState *vacrel) vmbuffer = InvalidBuffer; } + /* + * Accumulate heap vacuum time before switching to index vacuum. + */ + INSTR_TIME_SET_CURRENT(now); + INSTR_TIME_SUBTRACT(now, vacrel->phase_start_time); + vacrel->heap_vacuum_elapsed_us += INSTR_TIME_GET_MICROSEC(now); + pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_VACUUM_TIME, + vacrel->heap_vacuum_elapsed_us); + /* Perform a round of index and heap vacuuming */ vacrel->consider_bypass_optimization = false; lazy_vacuum(vacrel); @@ -1383,6 +1405,9 @@ lazy_scan_heap(LVRelState *vacrel) /* Report that we are once again scanning the heap */ pgstat_progress_update_param(PROGRESS_VACUUM_PHASE, PROGRESS_VACUUM_PHASE_SCAN_HEAP); + + /* Restart heap scan timing */ + INSTR_TIME_SET_CURRENT(vacrel->phase_start_time); } buf = read_stream_next_buffer(stream, &per_buffer_data); @@ -1596,6 +1621,15 @@ lazy_scan_heap(LVRelState *vacrel) read_stream_end(stream); + /* + * Accumulate final heap vacuum time before index vacuuming. + */ + INSTR_TIME_SET_CURRENT(now); + INSTR_TIME_SUBTRACT(now, vacrel->phase_start_time); + vacrel->heap_vacuum_elapsed_us += INSTR_TIME_GET_MICROSEC(now); + pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_VACUUM_TIME, + vacrel->heap_vacuum_elapsed_us); + /* * Do index vacuuming (call each index's ambulkdelete routine), then do * related heap vacuuming @@ -2506,6 +2540,7 @@ lazy_vacuum_all_indexes(LVRelState *vacrel) }; int64 progress_start_val[2]; int64 progress_end_val[3]; + instr_time now; Assert(vacrel->nindexes > 0); Assert(vacrel->do_index_vacuuming); @@ -2526,6 +2561,9 @@ lazy_vacuum_all_indexes(LVRelState *vacrel) progress_start_val[1] = vacrel->nindexes; pgstat_progress_update_multi_param(2, progress_start_index, progress_start_val); + /* Start timing index vacuum phase */ + INSTR_TIME_SET_CURRENT(vacrel->phase_start_time); + if (!ParallelVacuumIsActive(vacrel)) { for (int idx = 0; idx < vacrel->nindexes; idx++) @@ -2582,6 +2620,14 @@ lazy_vacuum_all_indexes(LVRelState *vacrel) * We deliberately include the case where we started a round of bulk * deletes that we weren't able to finish due to the failsafe triggering. */ + + /* Accumulate index vacuum elapsed time */ + INSTR_TIME_SET_CURRENT(now); + INSTR_TIME_SUBTRACT(now, vacrel->phase_start_time); + vacrel->index_vacuum_elapsed_us += INSTR_TIME_GET_MICROSEC(now); + pgstat_progress_update_param(PROGRESS_VACUUM_INDEX_VACUUM_TIME, + vacrel->index_vacuum_elapsed_us); + vacrel->num_index_scans++; progress_end_val[0] = 0; progress_end_val[1] = 0; @@ -2644,6 +2690,7 @@ lazy_vacuum_heap_rel(LVRelState *vacrel) Buffer vmbuffer = InvalidBuffer; LVSavedErrInfo saved_err_info; TidStoreIter *iter; + instr_time now; Assert(vacrel->do_index_vacuuming); Assert(vacrel->do_index_cleanup); @@ -2653,6 +2700,9 @@ lazy_vacuum_heap_rel(LVRelState *vacrel) pgstat_progress_update_param(PROGRESS_VACUUM_PHASE, PROGRESS_VACUUM_PHASE_VACUUM_HEAP); + /* Start timing heap vacuum (second pass) phase */ + INSTR_TIME_SET_CURRENT(vacrel->phase_start_time); + /* Update error traceback information */ update_vacuum_error_info(vacrel, &saved_err_info, VACUUM_ERRCB_PHASE_VACUUM_HEAP, @@ -2742,6 +2792,13 @@ lazy_vacuum_heap_rel(LVRelState *vacrel) vacrel->relname, vacrel->dead_items_info->num_items, vacuumed_pages))); + /* Accumulate heap vacuum (second pass) elapsed time */ + INSTR_TIME_SET_CURRENT(now); + INSTR_TIME_SUBTRACT(now, vacrel->phase_start_time); + vacrel->heap_vacuum_elapsed_us += INSTR_TIME_GET_MICROSEC(now); + pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_VACUUM_TIME, + vacrel->heap_vacuum_elapsed_us); + /* Revert to the previous phase information for error traceback */ restore_vacuum_error_info(vacrel, &saved_err_info); } @@ -2955,6 +3012,7 @@ lazy_cleanup_all_indexes(LVRelState *vacrel) }; int64 progress_start_val[2]; int64 progress_end_val[2] = {0, 0}; + instr_time now; Assert(vacrel->do_index_cleanup); Assert(vacrel->nindexes > 0); @@ -2967,6 +3025,9 @@ lazy_cleanup_all_indexes(LVRelState *vacrel) progress_start_val[1] = vacrel->nindexes; pgstat_progress_update_multi_param(2, progress_start_index, progress_start_val); + /* Start timing index cleanup phase */ + INSTR_TIME_SET_CURRENT(vacrel->phase_start_time); + if (!ParallelVacuumIsActive(vacrel)) { for (int idx = 0; idx < vacrel->nindexes; idx++) @@ -2992,6 +3053,13 @@ lazy_cleanup_all_indexes(LVRelState *vacrel) &(vacrel->worker_usage.cleanup)); } + /* Accumulate index cleanup elapsed time */ + INSTR_TIME_SET_CURRENT(now); + INSTR_TIME_SUBTRACT(now, vacrel->phase_start_time); + vacrel->index_cleanup_elapsed_us += INSTR_TIME_GET_MICROSEC(now); + pgstat_progress_update_param(PROGRESS_VACUUM_INDEX_CLEANUP_TIME, + vacrel->index_cleanup_elapsed_us); + /* Reset the progress counters */ pgstat_progress_update_multi_param(2, progress_end_index, progress_end_val); } diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql index 73a1c1c4670..50c827ae390 100644 --- a/src/backend/catalog/system_views.sql +++ b/src/backend/catalog/system_views.sql @@ -1342,7 +1342,10 @@ CREATE VIEW pg_stat_progress_vacuum AS CASE S.param13 WHEN 1 THEN 'manual' WHEN 2 THEN 'autovacuum' WHEN 3 THEN 'autovacuum_wraparound' - ELSE NULL END AS started_by + ELSE NULL END AS started_by, + S.param14 / 1000000::double precision AS heap_vacuum_time, + S.param15 / 1000000::double precision AS index_vacuum_time, + S.param16 / 1000000::double precision AS index_cleanup_time FROM pg_stat_get_progress_info('VACUUM') AS S LEFT JOIN pg_database D ON S.datid = D.oid; diff --git a/src/include/commands/progress.h b/src/include/commands/progress.h index 2a12920c75f..dc4abc11647 100644 --- a/src/include/commands/progress.h +++ b/src/include/commands/progress.h @@ -31,6 +31,9 @@ #define PROGRESS_VACUUM_DELAY_TIME 10 #define PROGRESS_VACUUM_MODE 11 #define PROGRESS_VACUUM_STARTED_BY 12 +#define PROGRESS_VACUUM_HEAP_VACUUM_TIME 13 +#define PROGRESS_VACUUM_INDEX_VACUUM_TIME 14 +#define PROGRESS_VACUUM_INDEX_CLEANUP_TIME 15 /* Phases of vacuum (as advertised via PROGRESS_VACUUM_PHASE) */ #define PROGRESS_VACUUM_PHASE_SCAN_HEAP 1 diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out index a65a5bf0c4f..7242b929815 100644 --- a/src/test/regress/expected/rules.out +++ b/src/test/regress/expected/rules.out @@ -2202,7 +2202,10 @@ pg_stat_progress_vacuum| SELECT s.pid, WHEN 2 THEN 'autovacuum'::text WHEN 3 THEN 'autovacuum_wraparound'::text ELSE NULL::text - END AS started_by + END AS started_by, + ((s.param14)::double precision / (1000000)::double precision) AS heap_vacuum_time, + ((s.param15)::double precision / (1000000)::double precision) AS index_vacuum_time, + ((s.param16)::double precision / (1000000)::double precision) AS index_cleanup_time FROM (pg_stat_get_progress_info('VACUUM'::text) s(pid, datid, relid, param1, param2, param3, param4, param5, param6, param7, param8, param9, param10, param11, param12, param13, param14, param15, param16, param17, param18, param19, param20) LEFT JOIN pg_database d ON ((s.datid = d.oid))); pg_stat_recovery| SELECT promote_triggered, -- 2.47.3
estimate_vacuum_completion.sql
Description: application/sql
