On Thu, Oct 20, 2022 at 11:09 AM Jeff Davis <pg...@j-davis.com> wrote:
> The terminology is getting slightly confusing here: by
> "antiwraparound", you mean that it's not skipping unfrozen pages, and
> therefore is able to advance relfrozenxid. Whereas the
> PROC_VACUUM_FOR_WRAPAROUND is the same thing, except done with greater
> urgency because wraparound is imminent. Right?

Not really. I started this thread to discuss a behavior in
autovacuum.c and proc.c (the autocancellation behavior), which is,
strictly speaking, not related to the current vacuumlazy.c behavior we
call aggressive mode VACUUM. Various hackers have in the past
described antiwraparound autovacuum as "implying aggressive", which
makes sense; what's the point in doing an antiwraparound autovacuum
that can almost never advance relfrozenxid?

It is nevertheless true that antiwraparound autovacuum is a behavior
independent of aggressive VACUUM. The former is an autovacuum thing,
and the latter is a VACUUM thing. That's just how it works,
mechanically.

If this division seems artificial or pedantic to you, then consider
the fact that you can quite easily get a non-aggressive antiwraparound
autovacuum by using the storage option called
autovacuum_freeze_max_age (instead of the GUC):

https://postgr.es/m/CAH2-Wz=DJAokY_GhKJchgpa8k9t_H_OVOvfPEn97jGNr9W=d...@mail.gmail.com

This is even a case where we'll output a distinct description in the
server log when autovacuum logging is enabled and gets triggered. So
while there may be no point in an antiwraparound autovacuum that is
non-aggressive, that doesn't stop them from happening. Regardless of
whether or not that's an intended behavior, that's just how the
mechanism has been constructed.

> > There is no inherent reason why we have to do both
> > things at exactly the same XID-age-wise time. But there is reason to
> > think that doing so could make matters worse rather than better [1].
>
> Can you explain?

Why should the special autocancellation behavior for antiwraparound
autovacuums kick in at exactly the same point that we first launch an
antiwraparound autovacuum? Maybe that aggressive intervention will be
needed, in the end, but why start there?

With my patch series, antiwraparound autovacuums still occur, but
they're confined to things like static tables -- things that are
pretty much edge cases.
They still need to behave sensibly (i.e. reliably
advance relfrozenxid based on some principled approach), but now
they're more like "an autovacuum that happens because no other
condition triggered an autovacuum". To some degree this is already the
case, but I'd like to be more deliberate about it.

Leaving my patch series aside, I still don't think that it makes sense
to make it impossible to auto-cancel antiwraparound autovacuums,
across the board, regardless of the underlying table age. We still
need something like that, but why not give a still-cancellable
autovacuum worker a chance to resolve the problem? Why take a risk of
causing much bigger problems (e.g., blocking automated DDL that blocks
simple SELECT queries) before the point that that starts to look like
the lesser risk (compared to hitting xidStopLimit)?

-- 
Peter Geoghegan
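
The message above describes two decisions that are made independently:
whether autovacuum launches an antiwraparound autovacuum, and whether
VACUUM runs in aggressive mode. The following is a minimal sketch of
that separation, not PostgreSQL code. It assumes plain integer table
ages (no XID wraparound arithmetic) and ignores the per-table
autovacuum_freeze_table_age storage option. The names Settings,
is_antiwraparound, is_aggressive and is_autocancellable are
illustrative only, and no_cancel_age is a hypothetical second threshold
used to picture the "why start there?" argument; it is not an existing
setting and is not taken from the patch series.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Settings:
    # GUC values, set to the PostgreSQL defaults.
    guc_autovacuum_freeze_max_age: int = 200_000_000
    guc_vacuum_freeze_table_age: int = 150_000_000
    # Per-table storage option; None means "not set on this table".
    reloption_autovacuum_freeze_max_age: Optional[int] = None


def is_antiwraparound(table_age: int, s: Settings) -> bool:
    """Autovacuum-side decision: is the table old enough to force a launch?"""
    limit = s.guc_autovacuum_freeze_max_age
    if s.reloption_autovacuum_freeze_max_age is not None:
        # The storage option can only lower the limit, never raise it.
        limit = min(limit, s.reloption_autovacuum_freeze_max_age)
    return table_age > limit


def is_aggressive(table_age: int, s: Settings) -> bool:
    """VACUUM-side decision: must this VACUUM visit every unfrozen page?"""
    # The effective cutoff is capped at 95% of the GUC, not the storage option.
    cap = s.guc_autovacuum_freeze_max_age * 95 // 100
    cutoff = min(s.guc_vacuum_freeze_table_age, cap)
    return table_age > cutoff


def is_autocancellable(table_age: int, s: Settings, no_cancel_age: int) -> bool:
    """Hypothetical policy: stay cancellable until a second, higher age."""
    if not is_antiwraparound(table_age, s):
        return True
    return table_age <= no_cancel_age


# A table with a low per-table autovacuum_freeze_max_age.
low_reloption = Settings(reloption_autovacuum_freeze_max_age=10_000_000)
example_age = 20_000_000
antiwraparound = is_antiwraparound(example_age, low_reloption)
aggressive = is_aggressive(example_age, low_reloption)
```

With these inputs the table is past its per-table limit but well short
of the aggressive cutoff, which is the non-aggressive antiwraparound
case mentioned above. Keeping is_autocancellable as a separate function
with its own threshold is only one way to picture decoupling the launch
point from the point where autocancellation stops being honored.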