On Thu, Sep 28, 2023 at 12:03 AM Peter Geoghegan <p...@bowt.ie> wrote:
> But isn't the main problem *not* freezing when we could and
> should have? (Of course the cost of freezing is very relevant, but
> it's still secondary.)
Perhaps this is all in how you look at it, but I don't see it this way.
It's easy to see how to solve the "not freezing" problem: just freeze
everything as often as possible. If that were cheap enough that we could
just do it, then we'd just do it and be done here. The problem is that,
at least in my opinion, that seems too expensive in some cases. I'm
starting to believe that those cases are narrower than I once thought,
but I think they do exist. So now, I'm thinking that maybe the main
problem is identifying when you've got such a case, so that you know
when you need to be less aggressive.

> Won't the algorithm that you've sketched always think that
> "unfreezing" pages doesn't affect recently frozen pages with such a
> workload? Isn't the definition of "recently frozen" that emerges from
> this algorithm not in any way related to the order delivery time, or
> anything like that? You know, rather like vacuum_freeze_min_age.

FWIW, I agree that vacuum_freeze_min_age sucks. I have been reluctant
to endorse changes in this area mostly because I fear replacing one bad
idea with another, not because I think that what we have now is
particularly good. It's better to be wrong in the same way in every
release than to have every release be equally wrong but in a different
way.

Also, I think the question of what "recently frozen" means is a good
one, but I'm not convinced that it ought to relate to the order
delivery time. If we insert into a table and 12-14 hours go by before
it's updated, it doesn't seem particularly bad to me if we froze that
data in the meantime (regardless of what metric drove that freezing).
Same thing if it's 2-4 hours. What seems bad to me is if we're
constantly updating the table and vacuum comes sweeping through,
freezing everything to no purpose over and over again, only for it to
be un-frozen a few seconds or minutes later. Now maybe that's the wrong
idea. After all, as a percentage, the overhead is the same either way,
whether we're talking about WAL volume or CPU cycles. But somehow it
feels worse to make the same mistakes every few minutes, or potentially
even every few tens of seconds, than it does to make them every few
hours. The absolute cost is a lot higher.

> On a positive note, I like that what you've laid out freezes eagerly
> when an FPI won't result -- this much we can all agree on. I guess
> that that part is becoming uncontroversial.

I don't think that we're going to be able to get away with freezing
rows in a small, frequently-updated table just because no FPI will
result. I think Melanie's results show that the cost is not negligible.
But Andres's pseudocode algorithm, although it is more aggressive in
that case, doesn't necessarily seem bad to me, because it still has
some ability to hold off freezing in such cases if our statistics show
that it isn't working out.

-- 
Robert Haas
EDB: http://www.enterprisedb.com
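
[A minimal standalone C sketch of the kind of statistics-driven
back-off described in the last paragraph. This is not Andres's actual
pseudocode from the thread; every name and the 10% threshold here are
hypothetical. The idea: track, per table, how many recently frozen
pages were "unfrozen" again soon afterward, and stop opportunistic
freezing when that ratio gets too high.]

#include <stdbool.h>
#include <stdio.h>

typedef struct TableFreezeStats
{
    double      pages_frozen;           /* pages frozen by recent vacuums */
    double      pages_unfrozen_early;   /* of those, modified again "soon" */
} TableFreezeStats;

/* Assumed threshold: back off once >10% of recent freezes are undone. */
#define UNFREEZE_RATIO_LIMIT    0.10

static double
unfreeze_ratio(const TableFreezeStats *stats)
{
    if (stats->pages_frozen <= 0)
        return 0.0;
    return stats->pages_unfrozen_early / stats->pages_frozen;
}

/*
 * Decide whether to freeze a page during vacuum. Freezing that emits
 * no extra full-page image is cheap, so tolerate the full threshold
 * there; when an FPI would result, demand a stricter track record.
 */
static bool
should_freeze_page(const TableFreezeStats *stats, bool freeze_would_emit_fpi)
{
    double      ratio = unfreeze_ratio(stats);

    if (ratio > UNFREEZE_RATIO_LIMIT)
        return false;           /* recent freezing keeps being undone */
    if (!freeze_would_emit_fpi)
        return true;            /* cheap case: freeze eagerly */
    return ratio < UNFREEZE_RATIO_LIMIT / 2;
}

int
main(void)
{
    /* A table where 30% of recently frozen pages were soon modified:
     * the heuristic holds off even in the cheap no-FPI case. */
    TableFreezeStats hot = { .pages_frozen = 100, .pages_unfrozen_early = 30 };

    /* A table whose frozen pages stay frozen: freeze either way. */
    TableFreezeStats cold = { .pages_frozen = 100, .pages_unfrozen_early = 1 };

    printf("hot table, no FPI:   %d\n", should_freeze_page(&hot, false));
    printf("cold table, w/ FPI:  %d\n", should_freeze_page(&cold, true));
    return 0;
}

[Note that a heuristic like this addresses both concerns above: it is
more aggressive than vacuum_freeze_min_age for stable tables, yet it
can decline to freeze a small, frequently-updated table even when no
FPI would result, because the unfreeze ratio there stays high.]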