Attached is a worked-out patch for the approach proposed here: http://archives.postgresql.org/pgsql-hackers/2008-06/msg00777.php namely, that cache management for de-TOASTed datums is handled by TupleTableSlots.
To avoid premature detoasting of values that we might never need, the patch introduces a concept of an "indirect TOAST pointer", which has the same 0x80 or 0x01 header as an external TOAST pointer, but can be told apart by having a different length byte. Within that we have * pointer to original toasted field within the Slot's tuple * pointer to the owning Slot * pointer to decompressed copy, or NULL if not decompressed yet Some fairly straightforward extensions to the TupleTableSlot code, heaptuple.c, and tuptoaster.c make it all go. My original thoughts had included turning FREE_IF_COPY() into a no-op, but on investigation that seems impractical. One case that still depends on that pfree is where we have palloc'd a 4-byte-header copy of a short-header datum to support code that needs properly aligned datum content. The solution adopted in the patch is to arrange for pfree() applied to a cacheable detoasted object to be a no-op, whereas it still works normally for non-cached detoasted objects. We do this by inserting a dummy chunk header that points to a dummy memory context whose pfree support method does nothing. I think this part of the patch would be required for any toast caching method, not just this one. What I like about this patch is that it's a fairly small-footprint change, it doesn't add much overhead, and it covers caching of decompression for in-line-compressed datums as well as the out-of-line case. One thing I really *don't* like about it is that it requires everyplace that copies Datums to know about indirect pointers: in general, the copy must be a copy of the original toasted Datum, not of the indirect pointer, else we have indirect pointers that can outlive their owning TupleTableSlot (or at least outlive its current tuple cycle). There only seem to be about half a dozen such places in the current code, but still it seems a rather fragile coding rule. After playing with it for a little bit, I'm not convinced that it buys enough performance win to be worth applying --- the restriction of cache lifespan to one tuple cycle of a TupleTableSlot is awfully restrictive. (For example, sorts that involve toasted sort keys continue to suck, because the tuples being sorted aren't in Slots.) It would probably fix the specific case that the PostGIS hackers were complaining of, but I think we need something more. Still, I wanted to get it into the archives because the idea of indirect toast pointers might be useful for something else. regards, tom lane
bin3ziOy60Bud.bin
Description: toast-indirect-1.patch.gz
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers