On Tue, Oct 4, 2022 at 7:59 PM Jeff Davis <pg...@j-davis.com> wrote:
> I am fine with that, but I'd like us all to understand what the
> downsides are.
Although I'm sure that there must be one case that loses measurably,
it's not particularly obvious where to start looking for one. I mean
it's easy to imagine individual pages that we lose on, but it's much
harder to imagine a practical test case where most of the pages are
reliably like that.

> If I understand correctly:
>
> 1. Eager freezing (meaning to freeze at the same time as setting all-
> visible) causes a modest amount of WAL traffic, hopefully before the
> next checkpoint so we can avoid FPIs. Lazy freezing (meaning set all-
> visible but don't freeze) defers the work, and it might never need to
> be done; but if it does, it can cause spikes at unfortunate times and
> is more likely to generate more FPIs.

Lazy freezing means freezing every eligible tuple on the page (every
XID < OldestXmin), but only once one or more XIDs on the page are
before FreezeLimit. Eager freezing means freezing every eligible tuple
whenever the page is about to be set all-visible, or whenever lazy
freezing would trigger freezing anyway.

Eager freezing tends to avoid big spikes in larger tables, which is
very important. It can sometimes be cheaper and better in every way
than lazy freezing, though lazy freezing sometimes retains an
advantage by avoiding freezing that turns out never to be needed at
all -- typically only in small tables.

Lazy freezing is fairly similar to what we do on HEAD now, though it's
not identical. It's still "page level freezing"; it just has lazy
criteria for triggering page freezing.

> 2. You're trying to mitigate the downsides of eager freezing by:
> a. when freezing a tuple, eagerly freeze other tuples on that page
> b. optimize WAL freeze records

Sort of. Both of these techniques apply to lazy freezing too, in fact.
It's just that eager freezing is likely to do the bulk of all freezing
that actually goes ahead, so it'll disproportionately be helped by
these techniques -- even when most VACUUM operations use the lazy
freezing strategy, which is probably the common case, just because
lazy freezing freezes lazily.

> 3. You're trying to capture the trade-off in #1 by using the table size
> as a proxy. Deferred work is only really a problem for big tables, so
> that's where you use eager freezing.

Right.

> But maybe we can just always use
> eager freezing?:

That doesn't seem like a bad idea, though it might be tricky to put
into practice. It might be possible to totally unite the concepts of
all-visible and all-frozen pages in the scope of this work, but there
are surprisingly many tricky details involved. I'm not surprised that
you're suggesting this -- it basically makes sense to me. It's just
the practicalities that I worry about here.

> a. You're mitigating the WAL work for freezing.

I don't see why this would be true. Lazy and Eager are exactly the
same for a given page at the point that freezing is triggered: we'll
freeze all eligible tuples (often, though not always, every tuple), or
none at all. Lazy vs. Eager describes the policy for deciding whether
to freeze a page, but does not affect the actual execution steps taken
once we decide to freeze (see the pseudocode sketch further down).

> b. A lot of people run with checksums on, meaning that setting the
> all-visible bit requires WAL work anyway, and often FPIs.

The idea of rolling the two WAL records into one does seem appealing,
but we'd still need the original WAL record for setting a page
all-visible in VACUUM's second heap pass (only the first heap pass
could be optimized by making the FREEZE_PAGE WAL record mark the page
all-visible too).

Or maybe we'd roll that into the VACUUM WAL record at the same time.
In any case the second heap pass would need a totally different WAL
logging strategy to the first heap pass. Not insurmountable, but not
exactly an easy thing to do in passing either.
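For concreteness, the combined record might end up looking something
like this. To be clear, this is just a hypothetical sketch: the struct
name, the flag, and the exact layout are all made up, loosely modeled
on the existing xl_heap_freeze_page record.

/*
 * Hypothetical combined WAL record -- an illustration only, nothing
 * that exists today.  The idea is to extend the existing freeze
 * record with a flag asking the redo routine to also set
 * PD_ALL_VISIBLE and the VM bit, letting the first heap pass skip
 * the separate xl_heap_visible record.
 */
typedef struct xl_heap_freeze_set_visible
{
    TransactionId cutoff_xid;   /* FreezeLimit used by this VACUUM */
    uint16        ntuples;      /* number of freeze plans that follow */
    uint8         flags;        /* hypothetical XLH_FREEZE_SET_ALL_VISIBLE */

    /* array of per-tuple freeze plans follows, as with freezing today */
} xl_heap_freeze_set_visible;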
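Separately, to restate the earlier point about Lazy vs. Eager being
purely a trigger policy: in pseudocode it comes out roughly like this.
Every name here is made up for illustration, and doesn't correspond to
anything in the patch.

/*
 * Both strategies share one execution path.  They differ solely in
 * the predicate that decides whether a given heap page gets frozen
 * during this VACUUM.
 */
typedef enum FreezeStrategy
{
    FREEZE_LAZY,
    FREEZE_EAGER
} FreezeStrategy;

typedef struct PageFreezeState
{
    bool    has_xid_before_freeze_limit;    /* any XID < FreezeLimit? */
    bool    will_be_set_all_visible;        /* about to set the VM bit? */
} PageFreezeState;

/* hypothetical helper: freezes every XID < OldestXmin on the page */
static void freeze_all_eligible_tuples(PageFreezeState *page);

static bool
should_freeze_page(PageFreezeState *page, FreezeStrategy strategy)
{
    /* Lazy trigger: one or more XIDs on the page precede FreezeLimit */
    if (page->has_xid_before_freeze_limit)
        return true;

    /* Eager trigger: also freeze pages about to become all-visible */
    if (strategy == FREEZE_EAGER && page->will_be_set_all_visible)
        return true;

    return false;
}

static void
maybe_freeze_page(PageFreezeState *page, FreezeStrategy strategy)
{
    /*
     * Execution is identical under both strategies: freeze every
     * eligible tuple on the page, or freeze nothing at all.
     */
    if (should_freeze_page(page, strategy))
        freeze_all_eligible_tuples(page);
}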
> c. All-visible is conceptually similar to freezing, but less
> important, and it feels more and more like the design concept of all-
> visible isn't carrying its weight.

Well, not quite -- at least not on the VM side itself. There are cases
where heap_lock_tuple() will update a tuple's xmax, replacing it with
a new MultiXactId. That necessitates clearing the page's all-frozen
bit in the VM -- but the all-visible bit will stay set. This is why
small numbers of all-visible (but not all-frozen) pages can appear
even in large tables that have been eagerly frozen.

> d. (tangent) I had an old patch[1] that actually removed
> PD_ALL_VISIBLE (the page bit, not the VM bit), which was rejected, but
> perhaps its time has come?

I remember that pgCon developer meeting well. :-)

If anything, your original argument for getting rid of PD_ALL_VISIBLE
is weakened by the proposal to merge together the WAL records for
freezing and for setting a heap page all-visible. You'd know for sure
that the page will be dirtied when such a WAL record needs to be
written, so there is actually no reason to care about dirtying the
page. No?

I'm in favor of reducing the number of WAL records required in common
cases if at all possible -- purely because the generic overhead of an
extra WAL record probably adds to the WAL overhead for work performed
in lazy_scan_prune(). But it seems like separate work to me.

--
Peter Geoghegan