Thank you for looking into this. Now we can execute a still narrow family of queries!
Maybe it helps to see this as a *social network feed*. Imagine a social
network: you have a few friends, or follow a few people, and you want to
see their updates ordered by date. For each user we have a different
combination of users whose posts we have to display, but even with
hundreds of followed users we may only show the first 10 posts. (There is
a concrete SQL sketch of this query shape below, after the quoted message.)

There is low-hanging fruit in the skip scan: if we only need M rows in
total, we can stop scanning a group as soon as it has produced M rows.
Let Nx be the number of friends, M the number of posts to show, and N the
number of posts per friend. With that early stop the skip scan reads
(Nx * M) rows, followed by an (Nx * M) sort, instead of (Nx * N) rows
followed by an (Nx * N) sort. With M = 10 and N = 1000 this is a
significant improvement. But if M ~ N, the merge scan wins: it runs with
M + Nx row accesses and (M + Nx) heap operations. If everything is on the
same page the skip scan would win.

The cost estimation is probably far off. I am also not considering the
filters applied after this operator, and I don't know whether the planner
infrastructure is able to adjust for them by itself. This is where I would
like reviewers' feedback. I think the planner costs are something to be
determined experimentally.

Next I will make it slightly more general, handling (the SQL shapes are
spelled out below, after the quoted message):

* More index columns: an index on (a, b, s...) could support
  WHERE a IN (...) ORDER BY b LIMIT N (ignoring s...)
* Multi-column prefix: WHERE (a, b) IN (...) ORDER BY c
* Non-leading prefix: WHERE b IN (...) AND a = const ORDER BY c
  on index (a, b, c)

---
Kind Regards,
Alexandre

On Wed, Feb 4, 2026 at 7:13 AM Michał Kłeczek <[email protected]> wrote:
>
>
> On 3 Feb 2026, at 22:42, Ants Aasma <[email protected]> wrote:
>
> On Mon, 2 Feb 2026 at 01:54, Tomas Vondra <[email protected]> wrote:
>
> I'm also wondering how common is the targeted query pattern? How common
> it is to have an IN condition on the leading column in an index, and
> ORDER BY on the second one?
>
>
> I have seen this pattern multiple times. My nickname for it is the
> timeline view. Think of the social media timeline, showing posts from
> all followed accounts in timestamp order, returned in reasonably sized
> batches. The naive SQL query will have to scan all posts from all
> followed accounts and pass them through a top-N sort. When the total
> number of posts is much larger than the batch size this is much slower
> than what is proposed here (assuming I understand it correctly) -
> effectively equivalent to running N index scans through Merge Append.
>
>
> My workarounds I have proposed users have been either to rewrite the
> query as a UNION ALL of a set of single value prefix queries wrapped
> in an order by limit. This gives the exact needed merge append plan
> shape. But repeating the query N times can get unwieldy when the
> number of values grows, so the fallback is:
>
> SELECT * FROM unnest(:friends) id, LATERAL (
> SELECT * FROM posts
> WHERE user_id = id
> ORDER BY tstamp DESC LIMIT 100)
> ORDER BY tstamp DESC LIMIT 100;
>
> The downside of this formulation is that we still have to fetch a
> batch worth of items from scans where we otherwise would have only had
> to look at one index tuple.
>
>
> GIST can be used to handle this kind of queries as it supports multiple
> sort orders.
> The only problem is that GIST does not support ORDER BY column.
> One possible workaround is [1] but as described there it does not play
> well with partitioning.
> I’ve started drafting support for ORDER BY column in GIST - see [2].
> I think it would be easier to implement and maintain than a new IAM (but I
> don’t have enough knowledge and experience to implement it myself)
>
> [1]
> https://www.postgresql.org/message-id/3FA1E0A9-8393-41F6-88BD-62EEEA1EC21F%40kleczek.org
> [2]
> https://www.postgresql.org/message-id/B2AC13F9-6655-4E27-BFD3-068844E5DC91%40kleczek.org
>
> —
> Kind regards,
> Michal
>
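
To make the target query shape concrete, here is a rough sketch. The
posts/user_id/tstamp names are made up (loosely borrowed from the query
quoted above) and only illustrate the intended plan shape:

CREATE INDEX ON posts (user_id, tstamp);

-- Nx = number of values in the IN list, M = LIMIT.
-- Today this reads all posts of all listed users and top-N sorts them;
-- the merge scan should be able to answer it with roughly M + Nx row
-- accesses, like a Merge Append over per-user index scans.
SELECT *
FROM posts
WHERE user_id IN (1, 2, 3, 4, 5)
ORDER BY tstamp DESC
LIMIT 10;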
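
And to spell out the three generalizations listed above as SQL (again
with made-up table and column names; only the WHERE / ORDER BY shapes
matter):

-- More index columns: index on (a, b, s), trailing column ignored
SELECT * FROM t WHERE a IN (1, 2, 3) ORDER BY b LIMIT 10;

-- Multi-column prefix: index on (a, b, c)
SELECT * FROM t WHERE (a, b) IN ((1, 10), (2, 20)) ORDER BY c LIMIT 10;

-- Non-leading prefix: index on (a, b, c), leading column fixed by equality
SELECT * FROM t WHERE a = 42 AND b IN (1, 2, 3) ORDER BY c LIMIT 10;
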
0002-MERGE-SCAN-Access-method.patch
Description: Binary data
0003-MERGE-SCAN-Planner-integration.patch
Description: Binary data
0001-MERGE-SCAN-Test-the-baseline.patch
Description: Binary data
