On Tue, Nov 25, 2025 at 11:54 PM Michael Paquier <[email protected]> wrote: > > Tom, you are registered as a reviewer of the patch. The point of > > contention of the patch, where I see there is no consensus yet, is if > > my approach of using a redirection for the external TOAST pointers > > with a new layer to facilitate the addition of more vartags (aka the > > 64b value vartag proposed here, concept that could also apply to > > compression methods later on) is acceptable. Moving to a different > > approach, like the "brutal" one I am naming upthread where the > > redirection layer is replaced by changes in all the code paths that > > need to be touched, would be of course cheaper at runtime as there > > would be no more redirection, but the maintenance would be a nightmare > > the more vartags we add, and I have some plans for more of these. > > Doing the switch would be a few hours work, so that would not be a big > > deal, I guess. The important part is an agreement about the approach, > > IMO. > > This point still got no reply. It would be nice to do something for > this release regarding this old issue, IMO..
I took a brief look at this today, looking at parts of v8-0005 and v8-0015. Although I don't dislike the idea of an abstraction layer in concept, it's unclear to me how much this particular abstraction layer is really buying you. It's basically abstracting away two things: the difference between the current varatt_external and varatt_external_oid8, and the unbundling of compression_method from extsize. That's not a lot. This thread has a bunch of ideas already on other things that people want to do to the TOAST system or think you should be doing to the TOAST system, and while I'm -1 on letting those discussions hijack the thread, it's worth paying attention to whether the abstraction layer makes any of them easier. As far as I can see, it doesn't. I suspect, for example, that direct TID access to toast values is a much worse idea than people here seem to be supposing, but whether that's true or false, toast_external_data doesn't help anyone who is trying to implement it. I don't think it even really helps with zstd compression, even though that's going to need a ToastCompressionId, so I kind of find myself wondering what the point is. One alternative idea that I had is just provide a way to turn a 4-byte TOAST pointe into an 8-byte TOAST pointer. If you inserted that surgically at certain places, maybe you could make 4-byte TOAST pointers invisible to certain parts of the system. But I'm also not sure that's any better than what you call the "brutal" approach, which I'm actually not sure is that brutal. I mean, how bad is it if we just deal with one more possibility at each place where we currently deal with VARTAG_ONDISK? To be clear, I am not trying to say "absolutely don't do this abstraction layer". However, imagine a future where we have 10 vartags that can appear on disk: the current VARTAG_ONDISK and 9 others. If, when that time comes, your abstraction layer handles 5 or 6 or more of those, I'd call that a win. If the only two it ever handles are OID and OID8, I'd say that's a loss: it's not really an abstraction layer at all at that point. In that situation you'd rather just admit that what you're really trying to do is smooth over the distinction between OID and OID8 and keep any other goals (including unbundling the compression ID, IMHO) separate. I am somewhat bothered by the idea of just bluntly changing everything over to 8-byte TOAST OIDs. Widening the TOAST table column doesn't concern me; it eats up 4 bytes, but the tuples are ~2k so it's not that much of a difference. Widening the TOAST pointer seems like a bigger concern. Every single toasted value is getting 4 bytes wider. A single tuple could potentially have a significant number of such columns, and as a result, get significantly wider. We could still use the 4-byte toast pointer format if the value ID happens to be small, but if we just assign the value ID using a 4-byte counter, after the first 4B allocations, it will never be small again. After that, relations that never would have needed anything like 4B toast pointers will still pay the cost for those that do, which seems quite sad. Whether this patch should be responsible for doing something about that sadness is unclear to me: a sequence-per-relation to allow separate counter allocation for each relation would be great, but bottlenecking this patch set behind such a large change might well be the wrong idea. I am also uncertain how much loss we're really talking about: if for example a wide table contains 5 long text blogs per row (which seems fairly high) then that's "only" 20 bytes/row, and probably those rows don't fit into 8kB blocks that well anyway, so maybe the loss of efficiency isn't much. Maybe the biggest concern is that, IIUC, some tuples that currently can be stored on disk wouldn't be storable at all any more with the wider pointers, because the row couldn't be squeezed into the block no matter how much pressure the TOAST code applies. If that's correct, I think there's a pretty good chance that this patch will make a few people who are already desperately unhappy with the TOAST system even more unhappy. There's nothing you can really do if you're up against that hard limit. I'm not sure what, if anything, we can or should do about that, but it seems like something we ought to at least discuss. -- Robert Haas EDB: http://www.enterprisedb.com
