On Thu, Jan 18, 2024 at 11:17 AM Peter Geoghegan <p...@bowt.ie> wrote:
> True. But the way that PageGetHeapFreeSpace() returns 0 for a page
> with 291 LP_DEAD stubs is a much older behavior. When that happens it
> is literally true that the page has lots of free space. And yet it's
> not free space we can actually use. Not until those LP_DEAD items are
> marked LP_UNUSED.
To me, this is just accurate reporting. What we care about in this
context is the amount of free space on the page that can be used to
store a new tuple. When there are no line pointers available to be
allocated, that amount is 0.

> Another big source of inaccuracies here is that we don't credit
> RECENTLY_DEAD tuple space with being free space. Maybe that isn't a
> huge problem, but it makes it even harder to believe that precision
> in FSM accounting is an intrinsic good.

The difficulty here is that we don't know how long it will be before
that space can be reused. Those recently dead tuples could become dead
within a few milliseconds or stick around for hours. I've wondered
about the merits of some FSM that had built-in visibility awareness,
i.e. the capability to record something like "page X currently has Y
space free and after XID Z is all-visible it will have Y' space free".
That seems complex, but without it, we either have to bet that the
space will actually become free before anyone tries to use it, or that
it won't. If whatever guess we make is wrong, bad things happen.

> My remarks about "FSM_CATEGORIES-wise precision" were basically
> remarks about the fundamental problem with the free space map. Which
> is really that it's just a map of free space, that gives exactly zero
> thought to various high level things that *obviously* matter. I
> wasn't particularly planning on getting into the specifics of that
> with you now, on this thread.

Fair.

> A brief recap might be useful: other systems with a heap table AM
> free space management structure typically represent the free space
> available on each page using a far more coarse grained counter.
> Usually one with less than 10 distinct increments. The immediate
> problem with FSM_CATEGORIES having such a fine granularity is that it
> increases contention/competition among backends that need to find
> some free space for a new tuple. They'll all diligently try to find
> the page with the least free space that still satisfies their
> immediate needs -- there is no thought for the second-order effects,
> which are really important in practice.

I think that the completely deterministic nature of the computation is
a mistake regardless of anything else. That serves to focus contention
rather than spreading it out, which is dumb, and would still be dumb
with any other number of FSM_CATEGORIES.

> What I really wanted to convey is this: if you're going to go the
> route of ignoring LP_DEAD free space during vacuuming, you're
> conceding that having a high degree of precision about available free
> space isn't actually useful (or wouldn't be useful if it was actually
> possible at all). Which is something that I generally agree with. I'd
> just like it to be clear that you/Melanie are in fact taking one
> small step in that direction. We don't need to discuss possible later
> steps beyond that first step. Not right now.

Yeah. I'm not sure we're actually going to change that right now, but
I agree with the high-level point regardless, which I would summarize
like this: The current system provides more precision about available
free space than we actually need, while failing to provide some other
things that we really do need. We need not agree today on exactly what
those other things are or how best to get them in order to agree that
the current system has significant flaws, and we do agree that it
does.

--
Robert Haas
EDB: http://www.enterprisedb.com