On Tue, Dec 2, 2025 at 9:18 AM Chengpeng Yan <[email protected]> wrote:
>
> Hi hackers,
>
> This patch implements GOO (Greedy Operator Ordering), a greedy
> join-order search method for large join problems, based on Fegaras (DEXA
> ’98) [1]. The algorithm repeatedly selects, among all legal joins, the
> join pair with the lowest estimated total cost, merges them, and
> continues until a single join remains. Patch attached.

Interesting.
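
For readers following along, the greedy loop described above can be
sketched roughly as below. This is only a toy, standalone illustration:
the relation names, cardinalities and selectivities are made up, the
"cost" is simply the estimated output cardinality, and join-legality
checks (outer joins, clauseless joins, etc.) are ignored. The patch
itself presumably plugs into the planner's real cost model rather than
anything like this.

/*
 * Toy sketch of Greedy Operator Ordering (GOO), per Fegaras (DEXA '98):
 * repeatedly merge the pair of join trees whose join has the lowest
 * estimated "cost" (here just the estimated output cardinality) until a
 * single tree remains.  Not the patch's code; all numbers are made up.
 */
#include <stdio.h>
#include <string.h>

#define NREL 5

typedef struct
{
    double      rows;           /* estimated cardinality of this (sub)tree */
    char        name[128];      /* textual join order, e.g. "((a b) c)" */
} Tree;

int
main(void)
{
    /* Made-up base-relation cardinalities (roughly a star schema). */
    Tree        trees[NREL] = {
        {1000000, "fact"}, {100, "d1"}, {200, "d2"}, {50, "d3"}, {10, "d4"}
    };
    double      sel[NREL][NREL];    /* pairwise selectivities between trees */
    int         ntrees = NREL;
    int         i, j, k;

    /* Made-up selectivities; a real planner would estimate these. */
    for (i = 0; i < NREL; i++)
        for (j = 0; j < NREL; j++)
            sel[i][j] = (i == j) ? 1.0 : 0.001;

    while (ntrees > 1)
    {
        int         besti = 0, bestj = 1;
        double      bestrows = -1.0;
        char        buf[128];

        /* Find the pair of trees whose join is cheapest. */
        for (i = 0; i < ntrees; i++)
            for (j = i + 1; j < ntrees; j++)
            {
                double      rows = trees[i].rows * trees[j].rows * sel[i][j];

                if (bestrows < 0.0 || rows < bestrows)
                {
                    bestrows = rows;
                    besti = i;
                    bestj = j;
                }
            }

        /* Merge tree bestj into tree besti. */
        snprintf(buf, sizeof(buf), "(%s %s)",
                 trees[besti].name, trees[bestj].name);
        strcpy(trees[besti].name, buf);
        trees[besti].rows = bestrows;

        /* Selectivity to any other tree is the product of both parts'. */
        for (k = 0; k < ntrees; k++)
            sel[besti][k] = sel[k][besti] = sel[besti][k] * sel[bestj][k];

        /* Compact the arrays: move the last tree into slot bestj. */
        trees[bestj] = trees[ntrees - 1];
        for (k = 0; k < ntrees; k++)
        {
            sel[bestj][k] = sel[ntrees - 1][k];
            sel[k][bestj] = sel[k][ntrees - 1];
        }
        /* Selectivity of a tree with itself stays neutral. */
        sel[besti][besti] = sel[bestj][bestj] = 1.0;
        ntrees--;

        printf("merged: %s  (est. rows %.0f)\n",
               trees[besti].name, trees[besti].rows);
    }

    return 0;
}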

> To get an initial sense of performance, I reused the star join /
> snowflake examples and the testing script from the thread in [2]. The
> star-join GUC in that SQL workload was replaced with
> `enable_goo_join_search`, so the same tests can run under DP (standard
> dynamic programming) / GEQO (Genetic Query Optimizer) / GOO. For these
> tests, geqo_threshold was set to 15 for DP, and to 5 for both GEQO and
> GOO. Other planner settings, including join_collapse_limit, remained at
> their defaults.
>
> On my local machine, a single-client pgbench run produces the following
> throughput (tps):
>
>                     |    DP    |   GEQO   |    GOO
> --------------------+----------+----------+-----------
> starjoin    (inner) |  1762.52 |  192.13  |  6168.89
> starjoin    (outer) |  1683.92 |  173.90  |  5626.56
> snowflake   (inner) |  1829.04 |  133.40  |  3929.57
> snowflake   (outer) |  1397.93 |   99.65  |  3040.52
>

Is pgbench the right workload to test this? What are we trying to
compare here: the planning time taken by DP vs. GEQO vs. GOO, the
quality of the plans generated by the different join-ordering
algorithms, or both? All pgbench queries are single-table scans with no
join search involved, so I am not sure how these gains can be
justified.


-- 
Regards,
Dilip Kumar
Google

