On Thu, Mar 24, 2022 at 3:28 PM Peter Geoghegan <p...@bowt.ie> wrote: > But non-aggressive VACUUMs have always been able to do that. > > How about: "Set relfrozenxid to oldest extant XID seen by VACUUM"
Sure, that sounds nice. > Believe it or not, I avoided functional changes in 0002 -- at least in > one important sense. That's why you had difficulty spotting any. This > must sound peculiar, since the commit message very clearly says that > the commit avoids a problem seen only in the non-aggressive case. It's > really quite subtle. Well, I think the goal in revising the code is to be as un-subtle as possible. Commits that people can't easily understand breed future bugs. > What you're saying here boils down to this: it doesn't matter what the > visibility map would say right this microsecond (in the aggressive > case) were we to call VM_ALL_FROZEN(): we know for sure that the VM > said that this page was all-frozen *in the recent past*. That's good > enough; we will never fail to scan a page that might have an XID < > OldestXmin (ditto for XMIDs) this way, which is all that really > matters. Makes sense. So maybe the commit message should try to emphasize this point e.g. "If a page is all-frozen at the time we check whether it can be skipped, don't allow it to affect the relfrozenxmin and relminmxid which we set for the relation. This was previously true for aggressive vacuums, but not for non-aggressive vacuums, which was inconsistent. (The reason this is a safe thing to do is that any new XIDs or MXIDs that appear on the page after we initially observe it to be frozen must be newer than any relfrozenxid or relminmxid the current vacuum could possibly consider storing into pg_class.)" > This is absolutely mandatory in the aggressive case, because otherwise > relfrozenxid advancement might be seen as unsafe. My observation is: > Why should we accept the same race in the non-aggressive case? Why not > do essentially the same thing in every VACUUM? Sure, that seems like a good idea. I think I basically agree with the goals of the patch. My concern is just about making the changes understandable to future readers. This area is notoriously subtle, and people are going to introduce more bugs even if the comments and code organization are fantastic. > And so you could almost say that there is now behavioral change at > all. I vigorously object to this part, though. We should always err on the side of saying that commits *do* have behavioral changes. We should go out of our way to call out in the commit message any possible way that someone might notice the difference between the post-commit situation and the pre-commit situation. It is fine, even good, to also be clear about how we're maintaining continuity and why we don't think it's a problem, but the only commits that should be described as not having any behavioral change are ones that do mechanical code movement, or are just changing comments, or something like that. > It seems kinda tricky to split up 0002 like that. It's possible, but > I'm not sure if it's possible to split it in a way that highlights the > issue that I just described. Because we already avoided the race in > the aggressive case. I do see that there are some difficulties there. I'm not sure what to do about that. I think a sufficiently clear commit message could possibly be enough, rather than trying to split the patch. But I also think splitting the patch should be considered, if that can reasonably be done. -- Robert Haas EDB: http://www.enterprisedb.com