Hi! I've found this discussion very interesting, in view of vacuuming TOAST tables is always a problem because these tables tend to bloat very quickly with dead data - just to remind, all TOAST-able columns of the relation use the same TOAST table which is one for the relation, and TOASTed data are not updated - there are only insert and delete operations.
Have you tested it with large and constantly used TOAST tables? How would it work with the current TOAST implementation? We propose a different approach to the TOAST mechanics [1], and a new vacuum would be very promising. Thank you! [1] https://commitfest.postgresql.org/41/3490/ On Fri, Dec 16, 2022 at 10:48 AM John Naylor <john.nay...@enterprisedb.com> wrote: > > On Wed, Dec 14, 2022 at 6:07 AM Peter Geoghegan <p...@bowt.ie> wrote: > > > > At the suggestion of Jeff, I wrote a Wiki page that shows motivating > > examples for the patch series: > > > > > https://wiki.postgresql.org/wiki/Freezing/skipping_strategies_patch:_motivating_examples > > > > These are all cases where VACUUM currently doesn't do the right thing > > around freezing, in a way that is greatly ameliorated by the patch. > > Perhaps this will help other hackers to understand the motivation > > behind some of these mechanisms. There are plenty of details that only > > make sense in the context of a certain kind of table, with certain > > performance characteristics that the design is sensitive to, and seeks > > to take advantage of in one way or another. > > Thanks for this. This is the kind of concrete, data-based evidence that I > find much more convincing, or at least easy to reason about. I'd actually > recommend in the future to open discussion with this kind of analysis -- > even before coding, it's possible to indicate what a design is *intended* > to achieve. And reviewers can likewise bring up cases of their own in a > concrete fashion. > > On Wed, Dec 14, 2022 at 12:16 AM Peter Geoghegan <p...@bowt.ie> wrote: > > > At the very least, a given VACUUM operation has to choose its freezing > > strategy based on how it expects the table will look when it's done > > vacuuming the table, and how that will impact the next VACUUM against > > the same table. Without that, then vacuuming an append-only table will > > fall into a pattern of setting pages all-visible in one vacuum, and > > then freezing those same pages all-frozen in the very next vacuum > > because there are too many. Which makes little sense; we're far better > > off freezing the pages at the earliest opportunity instead. > > That makes sense, but I wonder if we can actually be more specific: One > motivating example mentioned is the append-only table. If we detected that > case, which I assume we can because autovacuum_vacuum_insert_* GUCs exist, > we could use that information as one way to drive eager freezing > independently of size. At least in theory -- it's very possible size will > be a necessary part of the decision, but it's less clear that it's as > useful as a user-tunable knob. > > If we then ignored the append-only case when evaluating a freezing policy, > maybe other ideas will fall out. I don't have a well-thought out idea about > policy or knobs, but it's worth thinking about. > > Aside from that, I've only given the patches a brief reading. Having seen > the VM snapshot in practice (under "Scanned pages, visibility map snapshot" > in the wiki page), it's neat to see fewer pages being scanned. Prefetching > not only seems superior to SKIP_PAGES_THRESHOLD, but anticipates > asynchronous IO. Keeping only one VM snapshot page in memory makes perfect > sense. > > I do have a cosmetic, but broad-reaching, nitpick about terms regarding > "skipping strategy". That's phrased as a kind of negative -- what we're > *not* doing. Many times I had to pause and compute in my head what we're > *doing*, i.e. the "scanning strategy". For example, I wonder if the VM > strategies would be easier to read as: > > VMSNAP_SKIP_ALL_VISIBLE -> VMSNAP_SCAN_LAZY > VMSNAP_SKIP_ALL_FROZEN -> VMSNAP_SCAN_EAGER > VMSNAP_SKIP_NONE -> VMSNAP_SCAN_ALL > > Notice here they're listed in order of increasing eagerness. > > -- > John Naylor > EDB: http://www.enterprisedb.com > -- Regards, Nikita Malakhov Postgres Professional https://postgrespro.ru/