On Wed, 20 May 2026 at 10:45, Maxime Schoemans
<[email protected]> wrote:
>
> Hi hackers,
>
> This patch set adds multi-entry support to the GiST and SP-GiST access
> methods, allowing a single heap tuple to produce multiple index entries --
> similar to how GIN decomposes values via extractValue, but within the
> R-tree (GiST) and quad-tree (SP-GiST) frameworks.

I think the attachments were lost in transmission.

> Motivation
> ----------
>
> The existing GiST multirange opclass compresses a multirange into its
> bounding union range, losing information about gaps. This makes operators
> like @> and <@ imprecise at the index level, requiring many false-positive
> rechecks. Multi-entry GiST instead stores each component range as a
> separate index entry, giving the R-tree much tighter bounds and
> significantly reducing rechecks for multiranges with gaps.

Cool! I'm not sure this is preferable over a specialized AM
(comparable to what GIN is to btree), but cool!

> Other design decisions:
>
[...]
> - Index-only scans are disabled on a multi-entry key column, since
>   the stored sub-entries do not represent the original datum.

How is this done? Surely it's not through a planner check of GiST opclasses?

> - For SP-GiST, compress becomes optional when extractValue is present:
>   extractValue produces the leaf-typed values directly, and leafType
>   may differ from the input type (e.g., anymultirange -> anyrange).

I don't think it's generally safe to remove compression even when
extractValue is present.

> - Since leaf consistent functions see components, a separate opclass
>   (multirange_me_ops) is provided alongside the existing multirange_ops.
>
> Patch overview
> --------------
>
> 0001 - GiST AM infrastructure (gist.h, gist_private.h, gist.c,
>   gistbuild.c, gistget.c, gistscan.c, gistutil.c, gistvalidate.c): new
>   extractValue support function (proc 13), multi-entry insert/build paths,
>   TID deduplication during scans using simplehash.

Could you help me understand how this deduplication works? Wouldn't
this possibly use ~ unbounded memory (or however much memory is needed
to store up to 2^48 TIDs)?

GIN gets away with it because it uses a lossy Bitmap structure, and
because it traverses the TID space linearly, but we can't use lossy
checks for amgettuple() without losing correctness, and I don't think
GiST can do the same.

Kind regards,

Matthias van de Meent


Reply via email to