Re: Optimize join selectivity estimation by not reading MCV stats for unique join attributes

David Geier Mon, 14 Nov 2022 01:19:37 -0800

Hi Tom,

There won't *be* any MCV stats for a column that ANALYZE perceives to
be unique, so I'm not quite sure where the claimed savings comes from.

We save if one join attribute is unique while the other isn't. In thatcase stored MCV stats are read for the non-unique attribute but thennever used. This is because MCV stats in join selectivity estimation areonly used if they're present on both columns

Please provide a concrete example.

A super simple case already showing a significant speedup is thefollowing. The more ways to join two tables and the more joins overall,the higher the expected gain.


CREATE TABLE bar(col INT UNIQUE);
CREATE TABLE foo (col INT);
INSERT INTO foo SELECT generate_series(1, 1000000, 0.5);
SET default_statistics_target = 10000;
ANALYZE foo, bar;
\timing on
EXPLAIN SELECT * FROM foo, bar WHERE foo.col = bar.col;

Running the above query five times gave me average runtimes of:

- 0.62 ms without the patch and
- 0.48 ms with the patch.

--
David Geier
(ServiceNow)

Re: Optimize join selectivity estimation by not reading MCV stats for unique join attributes

Reply via email to