On Thu, Dec 15, 2022 at 11:59 PM Nikita Malakhov <huku...@gmail.com> wrote:
> I've found this discussion very interesting, in view of vacuuming
> TOAST tables is always a problem because these tables tend to
> bloat very quickly with dead data - just to remind, all TOAST-able
> columns of the relation use the same TOAST table which is one
> for the relation, and TOASTed data are not updated - there are
> only insert and delete operations.

I don't think that it would be any different to any other table that
happened to have lots of inserts and deletes, such as the table
described here:

https://wiki.postgresql.org/wiki/Freezing/skipping_strategies_patch:_motivating_examples#Mixed_inserts_and_deletes

In the real world, a table like this would probably consist of some
completely static data, combined with other data that is constantly
deleted and re-inserted -- probably only a small fraction of the table
at any one time. I would expect such a table to work quite well,
because the static pages would all become frozen (at least after a
while), leaving behind only the tuples that are deleted quickly, most
of the time. VACUUM would have a decent chance of noticing that it
will be cheap to advance relfrozenxid in earlier VACUUM operations, as
bloat is cleaned up -- even a VACUUM that happens long before the
point that autovacuum.c will launch an antiwraparound autovacuum has a
decent chance of it. That's not a new idea, really; the
pgbench_branches example from the Wiki page looks like that already,
and even works on Postgres 15.

Here is the part that's new: the pressure to advance relfrozenxid
grows gradually, as table age grows. If table age is still very young,
then we'll only do it if the number of "extra" scanned pages is < 5%
of rel_pages -- only when the added cost is very low (again, like the
pgbench_branches example, mostly). Once table age gets about halfway
towards the point that antiwraparound autovacuuming is required,
VACUUM then starts caring less about costs. It gradually worries less
about the costs, and more about the need to advance it. Ideally it
will happen before antiwraparound autovacuum is actually required.

I'm not sure how much this would help with bloat. I suspect that it
could make a big difference with the right workload. If you always
need frequent autovacuums, just to deal with bloat, then there is
never a good time to run an aggressive antiwraparound autovacuum.
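
To make the "gradually growing pressure" idea above a bit more concrete,
here is a rough Python sketch. It is not the patch's code. The 5% figure
and the "about halfway" point come from the description above; the linear
ramp after the halfway point, the function names, and the parameter names
are all assumptions made purely for illustration.

```python
def max_extra_scan_fraction(table_age, antiwraparound_age, base_fraction=0.05):
    """Largest fraction of rel_pages that may be scanned as "extra" pages.

    Hypothetical helper: the linear ramp is an assumption, not the
    patch's actual formula.
    """
    if antiwraparound_age <= 0:
        raise ValueError("antiwraparound_age must be positive")

    # How far the table is towards needing an antiwraparound autovacuum.
    progress = table_age / antiwraparound_age

    if progress <= 0.5:
        # Young table: only advance relfrozenxid when it is very cheap.
        return base_fraction
    if progress >= 1.0:
        # Antiwraparound point reached: cost no longer matters.
        return 1.0

    # Past halfway: care less and less about cost (assumed linear ramp
    # from base_fraction at 50% up to 1.0 at 100%).
    ramp = (progress - 0.5) / 0.5
    return base_fraction + ramp * (1.0 - base_fraction)


def should_advance_relfrozenxid(extra_pages, rel_pages, table_age,
                                antiwraparound_age):
    """Decide whether scanning the extra pages is worth it (hypothetical)."""
    if rel_pages == 0:
        return True  # nothing to scan, so advancing is free
    limit = max_extra_scan_fraction(table_age, antiwraparound_age)
    if limit >= 1.0:
        return True
    return extra_pages / rel_pages < limit
```

With a limit of 200 million XIDs (the default autovacuum_freeze_max_age),
a table whose age is 10 million would only have relfrozenxid advanced if
fewer than 5% of its pages were "extra", while the same table at age 150
million would tolerate roughly half of its pages being extra under this
assumed ramp.
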
An aggressive AV will probably end up taking much longer than the
typical autovacuum that deals with bloat. While the aggressive AV will
remove as much bloat as any other AV, in theory, that might not help
much. If the aggressive AV takes as long as (say) 5 regular
autovacuums would have taken, and if you really needed those 5
separate autovacuums to run, just to deal with the bloat, then that's
a real problem. The aggressive AV effectively causes bloat with such a
workload.

--
Peter Geoghegan