On 08.09.2025 13:56, David Geier wrote:
To evaluate it, I ran benchmarks on JOB with three variants:

$ ./benchmark.sh master
$ ./benchmark.sh merge
$ ./benchmark.sh hash

I compared total planning time across all 113 queries.
Was this running with optimizations? How did you extract the planning time?


I save all query plans using EXPLAIN SUMMARY, then go through all the plans, read the 'Planning Time' for each, and sum them up.

I would have expected the delta between the "merge" and "hash" variant
to be bigger, especially for default_statistics_target=10000. My small
test also showed that. Any idea why this is not showing in your results?


So would I. With default_statistics_target = 10000 and the selectivity in the JOB queries being close to zero, the difference should be noticeable. I can only explain the previous results by cache-related effects on my machine.

I reran the benchmark on a clean cluster and collected the top slowest JOB queries — now the effect is clearly visible.

Merge (sum of all JOB queries)
==================
default_statistics_target | Planner Speedup (×) | Planner Before (ms) | Planner After (ms)
--------------------------------------------------------------------------------
100                       | *1.00*                | 1888.105            | 1879.431 1000                      | *1.14*                | 2282.239            | 2009.114 2500                      | *2.10*                | 5595.030            | 2668.530 5000                      | *5.56*                | 18544.933           | 3333.252 7500                      | *9.17*                | 37390.956           | 4076.390 10000                     | *16.10*               | 69319.479           | 4306.417

HashMap (sum of all JOB queries)
==================
default_statistics_target | Planner Speedup (×) | Planner Before (ms) | Planner After (ms)
--------------------------------------------------------------------------------
100                     | *1.03*                | 1888.105            | 1828.088 1000                    | *1.18*                | 2282.239            | 1939.884 2500                    | *2.64*                | 5595.030            | 2117.872 5000                    | *7.80*                | 18544.933           | 2377.206 7500                    | *13.80*               | 37390.956           | 2709.973 10000                   | *23.32*               | 69319.479           | 2973.073

Top 10 slowest JOB queries (default_statistics_target = 10000)
Query | master (ms) | merge (ms) | Hash (ms)
------+-------------+------------+-----------
29c   | 1904.586    | 144.135    | 100.473
29b   | 1881.392    | 117.891    | 89.028
29a   | 1868.805    | 112.242    | 83.913
31c   | 1867.234    | 76.498     | 56.140
30c   | 1646.630    | 88.494     | 62.549
30b   | 1608.820    | 84.821     | 64.603
31a   | 1573.964    | 75.978     | 56.140
28a   | 1457.738    | 95.939     | 77.309
28b   | 1455.052    | 99.383     | 73.065
30a   | 1416.699    | 91.057     | 62.549


BTW, the hashmap from your patch could also be applied to eqjoinsel_semi() function.

--
Best regards,
Ilia Evdokimov,
Tantor Labs LLC,
https://tantorlabs.com

Reply via email to