On Thu, Mar 19, 2026, at 11:49 AM, Nathan Bossart wrote:
> On Thu, Mar 19, 2026 at 09:49:34AM -0400, Greg Burd wrote:
>> My concern isn't that wraparound vacuums are inherently alarming, I agree
>> with you that reaching freeze_max_age isn't a crisis. The issue is a
>> scoring-scale problem in the gap between freeze_max_age (200M) and
>> failsafe age (1.6B).
>>
>> In that 1.4B XID window, force_vacuum tables have XID scores of 1.0–8.0
>> (age/freeze_max_age), while typical active tables accumulate dead-tuple
>> scores of 18–70+ within hours of their last vacuum. The exponential boost
>> doesn't activate until failsafe age, so force_vacuum tables are
>> systematically outranked by routine bloat cleanup for what could be days
>> or weeks in production.
>
> I think "systematically outranked" makes the problem sound worse than it
> is. Once the freeze age is reached, the table is going to get added to the
> list no matter what, it just might be sorted lower.

Yeah, that was a bit of hyperbole on my part. :)

>>> Having said that, I'd not realised that Nathan capped the new GUCs at
>>> 1.0. I think we should allow those to be set higher, likely at least
>>> to 10.0.
>>
>> That would definitely help. If autovacuum_freeze_score_weight could be
>> set to 8.0–10.0, DBAs could manually restore the priority we want.
>
> Done in the attached.

+1

>>> Maybe we could consider adjusting the code that's setting the
>>> xid_score/mxid_score so that we start scaling the score aggressively
>>> when if (xid_age >= effective_xid_failsafe_age /
>>> Max(autovacuum_freeze_score_weight,1.0)) becomes true
>>
>> This is clever, it would make the aggressive scaling kick in earlier when
>> the weight is higher. At weight=8.0, you'd get exponential boost starting
>> at 200M (failsafe/8) instead of 1.6B.
>
> Seems reasonable. I've added this, too.

+1

> Something else we might want to
> consider is scaling the score once the freeze age is reached, just much
> less aggressively than we do at the failsafe age. It probably doesn't make
> sense to start scaling too much at 200M, but at 1.5B, yeah, we should
> probably process the table sooner than later.

So a scaling factor relative to some point like 200M? Maybe... but for now
I think what you have in v13 is about right and a solid improvement over
what's there now.

> --
> nathan
>
> Attachments:
> * v13-0001-autovacuum-scheduling-improvements.patch

LGTM!

best.

-greg
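
P.S. To make the arithmetic in this thread concrete, here is a minimal Python sketch of the two quantities discussed above: the linear XID score (age/freeze_max_age) and the age at which aggressive scaling would start under the proposed condition. This is not code from the v13 patch. The function names are hypothetical, the constants are the 200M and 1.6B figures quoted in the thread, and the shape of the exponential boost itself is not modeled because it isn't spelled out here.

```python
# Illustrative only: hypothetical helper names, not functions from the patch.
FREEZE_MAX_AGE = 200_000_000    # freeze_max_age figure quoted in the thread
FAILSAFE_AGE = 1_600_000_000    # failsafe age figure quoted in the thread


def linear_xid_score(xid_age, freeze_max_age=FREEZE_MAX_AGE):
    """Linear score described in the thread: age / freeze_max_age."""
    return xid_age / freeze_max_age


def aggressive_scaling_threshold(weight, failsafe_age=FAILSAFE_AGE):
    """XID age at which aggressive scaling would begin under the proposed
    condition: failsafe_age / Max(weight, 1.0). Weights below 1.0 are
    clamped so the threshold never exceeds the failsafe age."""
    return failsafe_age / max(weight, 1.0)


def scales_aggressively(xid_age, weight, failsafe_age=FAILSAFE_AGE):
    """True once xid_age has reached the weight-adjusted threshold."""
    return xid_age >= aggressive_scaling_threshold(weight, failsafe_age)


if __name__ == "__main__":
    for age in (FREEZE_MAX_AGE, 800_000_000, FAILSAFE_AGE):
        print(age, linear_xid_score(age))
    for weight in (0.5, 1.0, 8.0, 10.0):
        print(weight, aggressive_scaling_threshold(weight))
```

With weight=8.0 the threshold works out to 200M, matching the failsafe/8 figure above, and any weight at or below 1.0 leaves it at the 1.6B failsafe age.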
