On Jan 5, 2026, at 23:04, Tom Lane <[email protected]> wrote:

> Yeah, there might be some gold to be mined there. IIRC we already
> have some ability for LIMIT to tell an immediately-below Sort that
> only a limited number of rows will be retrieved. Could be useful
> if ForeignScan could be in on that.
I’m surprised the limit part of the plan isn’t there already. Is that something
that needs to be tweaked in the paths passed to the FDW?
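(For context, the mechanism Tom describes is visible on any local table:
the Limit node passes its bound down to the Sort node, which then reports
"Sort Method: top-N heapsort" rather than doing a full sort. Table name
here is hypothetical:)

    -- On a plain local table the bounded sort shows up in the plan:
    EXPLAIN (ANALYZE) SELECT c1 FROM local_t ORDER BY c1 LIMIT 1;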
> The first attached results are from current HEAD; the second are
> after applying your patch. What I find interesting is that
> with use_remote_estimate enabled, it always goes for shipping the
> MIN as-is. Your patch changes the behavior with use_remote_estimate
> disabled, and we can see that the reason is that that mode estimates
> the query cost at 100.24 with the minmax optimization and 146.78 when
> shipping the aggregate. I've not dug into where we are getting
> those numbers without use_remote_estimate, but perhaps that could
> use refinement.
Odd. They should be the same, no?
> On the other hand, there is not that much difference
> between the actual runtimes, so perhaps it doesn't matter for
> postgres_fdw.
Maybe not with a loopback connection, but some of these queries send all 10000
values over the network, no? That could be a lot of additional latency in some
situations. ISTM that the code should either (remote SQL for each option is
sketched below):
1. Push down the `LIMIT 1`; or
2. Push down the min/max and let the optimization happen remotely
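Roughly, the remote SQL for the two options would be (my sketch, modeled on
the Remote SQL shown in the plans below):

    -- Option 1: ship the minmax rewrite together with its limit:
    SELECT c1 FROM "S 1"."T 3" ORDER BY c1 ASC NULLS LAST LIMIT 1;

    -- Option 2: ship the aggregate and let the remote planner optimize it:
    SELECT min(c1) FROM "S 1"."T 3";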
> I can't speak to your results with ClickHouse, since I have no idea
> how the cost estimates are derived for that.
ClickHouse is column-oriented, and all the columns are sorted, so min/max
optimizations are completely different and never require an index scan or table
scan: they just read the first or last value in the relevant column. This is
why I started looking into this in the first place.
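(A minimal ClickHouse sketch, with a hypothetical table: a MergeTree table
stores each part's rows sorted by the ORDER BY key, so the extremes of the
key column sit at the ends of each part and no scan is needed.)

    CREATE TABLE t3 (c1 Int32, c2 Int32)
    ENGINE = MergeTree
    ORDER BY c1;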
But your finding of the difference between use_remote_estimate true and
false helped me figure out the right lever to pull. pg_clickhouse has some
remnants of use_remote_estimate from forking postgres_fdw back in 2019;
they don’t do anything, but they led me to a function that sets more-or-less
hard-coded estimates. In `estimate_path_cost_size()` I changed:
    p_total_cost = 5.0 + coef;

to

    p_total_cost = 0.1 + coef;
And that makes the min and max push down. Of course this impacts *all*
pg_clickhouse queries, but that’s mainly what we want for analytics-oriented
workloads.
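(For comparison, stock postgres_fdw exposes this same lever as server
options rather than a code change; I haven’t checked whether pg_clickhouse
still honors them. Server name and values here are illustrative:)

    ALTER SERVER clickhouse_svr
        OPTIONS (ADD fdw_startup_cost '0.1', ADD fdw_tuple_cost '0.01');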
Anyway, some comments on plans generated by your testing:
> explain (analyze, verbose) select min(c1) from "S 1"."T 3";
> explain (analyze, verbose) select min(c1) from "S 1"."T 3";
> explain (analyze, verbose) select min(c1) from "S 1"."T 3";
Index-only scan, 1 row (though it estimates 10000 rows, for some reason).
> explain (analyze, verbose) select min(c2) from "S 1"."T 3";
> explain (analyze, verbose) select min(c2) from "S 1"."T 3";
> explain (analyze, verbose) select min(c2) from "S 1"."T 3";
Table scan, 10000 rows, as expected.
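(Presumably the asymmetry is just that c1 is indexed and c2 is not, i.e.
the test setup has something like the following, though I haven’t checked:)

    CREATE INDEX ON "S 1"."T 3" (c1);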
> ALTER FOREIGN TABLE ft4 OPTIONS (use_remote_estimate 'false');
>
> explain (analyze, verbose) select min(c1) from ft4;
> explain (analyze, verbose) select min(c1) from ft4;
> explain (analyze, verbose) select min(c1) from ft4;
The Foreign Scan here confuses me:
   ->  Foreign Scan on public.ft4  (cost=100.00..761.35 rows=2925 width=4) (actual time=0.144..0.144 rows=1.00 loops=1)
         Output: ft4.c1
         Remote SQL: SELECT c1 FROM "S 1"."T 3" ORDER BY c1 ASC NULLS LAST
How does it result in only one row? Does that mean it gets all of them back
from the seq scan but only reads the first? Are they still delivered over the
network and buffered? Seems like it would be good to have `LIMIT 1` pushed down.
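(If I remember the postgres_fdw wire behavior correctly, the answer is that
the scan runs through a cursor and rows are fetched in batches of
fetch_size, 100 by default, so only the first batch crosses the wire before
the Limit node stops pulling. Cursor name illustrative:)

    DECLARE c1 CURSOR FOR
        SELECT c1 FROM "S 1"."T 3" ORDER BY c1 ASC NULLS LAST;
    FETCH 100 FROM c1;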
> explain (analyze, verbose) select min(c2) from ft4;
> explain (analyze, verbose) select min(c2) from ft4;
> explain (analyze, verbose) select min(c2) from ft4;
Same confusion about the foreign scan plan:
   ->  Foreign Scan on public.ft4  (cost=100.00..761.35 rows=2925 width=4) (actual time=0.831..0.831 rows=1.00 loops=1)
         Output: ft4.c2
         Remote SQL: SELECT c2 FROM "S 1"."T 3" ORDER BY c2 ASC NULLS LAST
Note also that its execution time is 0.900 ms, which implies a full table
scan on the remote. That makes sense; `LIMIT 1` would reduce the network
overhead, but not the remote execution time.
> ALTER FOREIGN TABLE ft4 OPTIONS (SET use_remote_estimate 'true');
>
> explain (analyze, verbose) select min(c1) from ft4;
> explain (analyze, verbose) select min(c1) from ft4;
> explain (analyze, verbose) select min(c1) from ft4;
> explain (analyze, verbose) select min(c2) from ft4;
> explain (analyze, verbose) select min(c2) from ft4;
> explain (analyze, verbose) select min(c2) from ft4;
Interesting that this configuration pushes down the min/max; of course the
same optimizations then happen remotely as they would locally, but now we
know only one row is returned, right?

My patch doesn’t change this case much, because the remote optimization
still takes place; it’s just that it never returns more than a single row.
Setting `LIMIT` pushdown aside, I guess the question is: how should an FDW
author properly tweak cost estimates to encourage or discourage the min/max
optimization?
Best,
David