Hello hackers, I'd like to discuss an issue I observed where TOAST tables do not properly inherit `vacuum_truncate` settings from their parent tables during autovacuum operations, leading to unexpected and prolonged truncation phases that can cause significant lock contention on production systems. I discussed a nature of this issue on a related topic here [1].
Based on the documentation [2], my understanding is that TOAST tables should inherit the `vacuum_truncate` setting from their parent tables. However, I've observed that this is not the case. > If a table parameter value is set and the equivalent toast. parameter is not, the TOAST table will use the table's parameter value. [2] Recap of the production incident We experienced this issue directly where an autovacuum on a TOAST table took approximately 5 hours despite the parent table having `vacuum_truncate=false` explicitly set. One important note here is that I added the vacuum_truncate=false to the parent table _after_ it was created via `ALTER TABLE ... SET (vacuum_truncate = false)`. During the autovacuum, the TOAST table performed truncation anyway which lead to locks on the reader and impacting queries basically. Example: ``` automatic vacuum of table "tines.pg_toast.pg_toast_27098": index scans: 1 pages: 12730516 removed, 421195 remain, 0 skipped due to pins, 12720236 skipped frozen tuples: 4575083 removed, 26717710 remain, 21498 are dead but not yet removable removable cutoff: 173810717, which was 5 XIDs old when operation ended new relfrozenxid: 173810722, which is 5 XIDs ahead of previous value frozen: 150 pages from table (0.04% of total) had 26717710 tuples frozen visibility map: 421045 pages set all-visible, 421045 pages set all-frozen index scan needed: 151751 pages from table (36.06% of total) had 4629490 dead item identifiers removed index "pg_toast_27098_index": pages: 303420 in total, 14038 newly deleted, 298668 currently deleted, 298418 reusable avg read rate: 8.730 MB/s, avg write rate: 2.140 MB/s buffer usage: 5132686 hits, 25714763 misses, 2140873 dirtied WAL usage: 2845123 records, 1285947 full page images, 458923847 bytes system usage: CPU: user: 327.79 s, system: 251.11 s, elapsed: 18722.21 s ``` After investigating the code, I discovered that TOAST tables do not properly inherit vacuum_truncate settings from their parent tables during autovacuum operations. The issue manifests in two scenarios per my understanding and testing: 1. Manual VACUUM: The toast_vacuum_params structure is copied early in vacuum_rel() before the main table processes its vacuum options, so TOAST tables receive the default VACOPT_TERNARY_DEFAULT value rather than the parent's final decision. 2. Autovacuum: TOAST tables are treated as completely separate vacuum operations in the autovacuum launcher, so they never consult their parent table's settings. They only use explicit toast.vacuum_truncate settings if present on the parent table, but when no such explicit TOAST-specific setting exists, they fall back to the global vacuum_truncate default rather than inheriting the parent table's vacuum_truncate setting. My understanding of how the inheritance should work (based on the documentation) is that - if a parent table has toast.vacuum_truncate explicitly set, that value is used correctly for the TOAST table. However, when no toast.vacuum_truncate is specified, the TOAST table should inherit the parent's vacuum_truncate setting. Currently, this second case fails during autovacuum operations, causing TOAST tables to use the global default instead of the parent's setting. Which then results in the lock contention issue that comes with truncation that I mentioned above. To validate this understanding, I added some debug logging to trace vacuum truncate decisions and was able to reproduce the issue consistently. Here's what the logs showed before the fix using the script I attached where I set the parent table to have `vacuum_truncate=false` and create & delete some dummy data while having the autovacuum trigger frequently: ``` LOG: VACUUM TRUNCATE DEBUG: relation "toast_truncate_test" has explicit vacuum_truncate=false LOG: VACUUM TRUNCATE DEBUG: relation "toast_truncate_test" final decision: params.truncate=DISABLED automatic vacuum: pages: 0 removed, 671 remain LOG: VACUUM TRUNCATE DEBUG: relation "pg_toast_327681" using global default vacuum_truncate=true LOG: VACUUM TRUNCATE DEBUG: relation "pg_toast_327681" final decision: params.truncate=ENABLED automatic vacuum: pages: 92963 removed, 139458 remain ``` To see if this is the case, I implemented a very POC(proof-of-concept)-level patch that would address both scenarios: 1. For manual VACUUM: Modified `vacuum_rel()` to pass the final truncation decision to TOAST tables after the main table processes its options. 2. For autovacuum: Added logic in the vacuum truncate decision code to detect when we're processing a TOAST table and look up the parent table's `vacuum_truncate` setting when the TOAST table itself has no explicit setting. The key changes are in `src/backend/commands/vacuum.c`: - Modified parameter passing to allow updating vacuum params after initial setup - Added parent table lookup logic using `pg_class.reltoastrelid` - Ensured TOAST tables inherit parent settings when they have `VACOPT_TERNARY_DEFAULT` After applying the fix, the logs show correct inheritance: ``` LOG: VACUUM TRUNCATE DEBUG: relation "toast_truncate_test" final decision: params.truncate=DISABLED automatic vacuum: pages: 0 removed, 671 remain LOG: VACUUM TRUNCATE DEBUG: relation "pg_toast_327681" inheriting vacuum_truncate=false from parent OID 327681 LOG: VACUUM TRUNCATE DEBUG: relation "pg_toast_327681" final decision: params.truncate=DISABLED automatic vacuum: pages: 0 removed, 139458 remain ``` The patch isn't quite production ready I would say and I mostly used it as a way to learn more about the codebase and the internals of the autovacuum process. I have attached the patch and the script I used to test it. What I am curious about is - does this report of the bug make sense and if the patch is headed in the right direction? Or is the issue here with the documentation and its actually intended behavior that autovacuum should not inherit the parent table's vacuum_truncate setting when working on TOAST tables? My preference would be that it works the way described in the documentation. And if that approach is acceptable, I'm particularly interested in feedback on whether the parent table lookup logic is appropriate for this use case, and if there are any edge cases I should consider. I haven't yet paid close attention to the patch around opening/closing tables and indexes, so that will be my next step if the approach seems reasonable. Happy to iterate as we go. Thanks for your time and consideration. Shayon [1] https://www.postgresql.org/message-id/CANqtF-qDGYhYDcpg3PEeDrXMmuJZJGTAeT0mJx0KrN%2BkVikZig%40mail.gmail.com [2] https://www.postgresql.org/docs/current/sql-createtable.html#RELOPTION-VACUUM-TRUNCATE
v1-0001-Vacuum-truncate-inheritance-support.patch
Description: Binary data
toast_autovacuum_test_fixed.sql
Description: Binary data