On Mon, 28 Jul 2025 at 20:13, Richard Guo <guofengli...@gmail.com> wrote:
> create table t (a int, b int, c int);
> insert into t select i%3, i, i from generate_series(1,500)i;
> analyze t;
>
> explain (analyze, costs off, timing off)
> select * from t t1 join lateral
>   (select t2.a, t2.b, t1.a x from t t2, t t3 offset 0) s
>   on s.a = t1.a;
>
> with enable_memoize set to on:
>
>  Planning Time: 2.470 ms
>  Execution Time: 98869.240 ms
>
> with enable_memoize set to off:
>
>  Planning Time: 1.791 ms
>  Execution Time: 55754.080 ms

There are two reasons why this goes bad: 1) the row estimate coming
out of the subquery is bad, and the Memoize costs are fooled into
thinking more items will fit in the cache than actually can; 2) the
order in which this query performs the Memoize lookups interacts
pathologically with the order Memoize evicts entries to free up
memory, and since not all items fit in the cache, the entry for each
key is always evicted before that key comes around again, so there
are never any hits.

For #1, if you run with costs on, you'll see the row estimates are
pretty far out:

->  Subquery Scan on s  (cost=0.00..6267.25 rows=1250 width=12)
(actual time=0.115..52.816 rows=83334.00 loops=500)

This causes the Memoize costing code to think only 1250 rows need to
be stored per cache entry, which leads it to believe that all 3
unique keys can be stored concurrently. If you fix the row estimates,
the Memoize costing will estimate that only 1 item will fit in the
cache, which is correct. Unfortunately, with 3 distinct keys to look
up, Memoize still thinks there will be a cache hit on about 1 in 3
lookups. That doesn't pan out because the table is populated with
"i%3", meaning the lookup keys arrive in the repeating sequence 0, 1,
2, 0, 1, 2, etc. The evictions follow the same pattern, i.e.
populating the entry for 1 evicts 0 and populating 2 evicts 1, and so
on, resulting in zero hits.
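
The "1 in 3" comes from the even-distribution assumption: with only 1
of the 3 entries cached at any time, a uniformly random next key
matches the cached one about 1 time in 3. The cycling that defeats
this is easy to see directly; assuming a sequential scan returns the
rows of this freshly loaded table in insertion order (not guaranteed
in general), the lookup keys arrive as:

select a from t limit 9;  -- first few outer rows, i.e. lookup keys

which returns 1, 2, 0, 1, 2, 0, 1, 2, 0 (the same 3-key cycle), so by
the time a key comes around again, its entry has always just been
evicted.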

The current Memoize costing code assumes the lookup keys will be
evenly distributed (let's call this the average case), and when that
assumption holds, you can see that Memoize is a win for this query:

explain analyze
select * from (select * from t order by random()) t1 join lateral
  (select t2.a, t2.b, t1.a x from t t2, t t3 offset 0) s
  on s.a = t1.a;

Running that I get:

Hits: 334  Misses: 166  Evictions: 164  Overflows: 0  Memory Usage: 8193kB
Execution Time: 9446.671 ms

And with memoize disabled:

Execution Time: 16547.240 ms

Having the keys arrive in order with ORDER BY a (let's call this the
best case), you can see it does even better. The query would be the
ordered variant of the one above, i.e. something like:
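
-- reconstruction: the earlier query with random() swapped for key order
explain analyze
select * from (select * from t order by a) t1 join lateral
  (select t2.a, t2.b, t1.a x from t t2, t t3 offset 0) s
  on s.a = t1.a;

With that I get: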

Hits: 497  Misses: 3  Evictions: 1  Overflows: 0  Memory Usage: 8193kB
Execution Time: 4041.162 ms

and it takes about the same 16.5 seconds as above when Memoize is
disabled.

> Any thoughts on how we might improve this?

If we were to tune the costs for the worst case, then we'd basically
always predict a zero percent hit ratio whenever there are more
distinct keys than will fit in the cache. Doing that would penalise
cases that get huge benefits today.

I have recently been thinking about statistics that represent the
min/max/mean distribution of values within a table and its indexes
according to the order they're read in a forward scan. Perhaps
something like that could inform the Memoize costs so we'd have a
better idea of the lookup key ordering. It might help in some cases,
but it certainly wouldn't be perfect for all of them.
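
For comparison, the closest existing statistic is probably
pg_stats.correlation, which records how well a column's values track
the physical row order:

-- correlation between physical row order and the values of t.a
select tablename, attname, correlation
from pg_stats
where tablename = 't' and attname = 'a';

I'd expect that to come out near zero both for the randomly ordered
case (where Memoize wins) and for the cyclic i%3 case (where it
loses), so a single correlation value can't tell these apart; hence
the need for something richer about the distribution in scan order.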

Another idea would be to add run-time detection to Memoize to
remember some of the most recently evicted keys, or perhaps just
their hash values, so that if we find we keep looking up a recently
evicted entry, we could decide to keep that one rather than continue
to evict it. That would obviously take > 0 memory, which eats into
the memory for caching tuples. I don't have a detailed idea of how
this would work. It sounds like it would be a most-commonly-used
rather than a most-recently-used cache.

On the whole, I don't really see this as a flaw in the Memoize code.
We've plenty of other cases in the planner that produce inferior
plans due to a lack of detail or accuracy in the table statistics, so
I'm not planning on rushing to look into a fix. I will keep it in
mind, however.

David

