On 21/04/2026 08:41, Evgeny Voropaev wrote:
Hello hackers,

Can this DFoR code replace integerset.c easily? Can we use it for
the vacuum dead TID list? For GIN posting lists? Where else?

Heikki, thank you for your attention and proposals. I'm learning areas
you proposed to be developed. This took time, since I am not adept at
them. Last week I also have been developing the DFoR patch to support
unsorted sequences. That's why there was the delay in answering.

About GIN.
Since GIN exploits TIDs sequences and saves it on the disk, it can be
the most appropriate candidate to be developed with DFoR.

+1. And maybe the tid lists in to B-tree tuples too while we're at it.

For GIN posting lists, one important property of the current compression scheme is that removing TIDs never makes the list larger than the original. That's important for VACUUM, see https://github.com/postgres/postgres/blob/d3bba041543593eb5341683107d899734dc8e73e/src/backend/access/gin/ginpostinglist.c#L55

About the dead TID list.
If I'm not mistaken, the dead TID list exists only in RAM and never on
the disk or in the network. So, what is the advantage supposed to be
achieved due to using compression in the dead TID list?

Reduces memory usage. And if it's faster to lookup than the current data structure, that too. I don't know if that works out.

About the GiST vacuuming and the use of integerset in it.
The integerset implements a tree in addition to compression.
DFoR now performs only compression. Moreover the size of a pack is
flexible (varying), which must become an issue for its usage in the
tree. It needs more thorough further elaboration to be developed.

Hmm. The integerset is a sparse list of integers, just like Frame of Reference. The tree inside it is just an implementation detail. I was thinking that you could replace the whole tree with DFoR, but I suppose you cannot do random lookups in a DFoR compressed list, so you'd still need the tree.

- Heikki



Reply via email to