I have a case that I though was an example of this issue, and that this patch would correct. I applied this patch to an 8.0.3 source distribution, but it didn't seem to solve my problem.
In a nutshell, I have a LIMIT query where the planner seems to favor a merge join over a nested loop. I've simplified the query as much as possible: itvtrackdata3=> \d tableA Table "public.tableA" Column | Type | Modifiers --------+----------+----------- foo | bigint | not null bar | smallint | not null bap | bigint | not null bip | bigint | not null bom | bigint | not null Indexes: "idx_tableA_bip" btree (bip) WHERE (bip = 9000000000000000000::bigint) "idx_tableA_foo" btree (foo) itvtrackdata3=> \d tableB Table "tableB" Column | Type | Modifiers ---------+----------+----------- bim | bigint | not null bif | smallint | not null baf | smallint | not null bof | smallint | not null buf | smallint | not null foo | bigint | not null Indexes: "idx_tableB_bim" btree ("bim", foo) itvtrackdata3=> set default_statistics_target to 1000; SET Time: 0.448 ms itvtrackdata3=> analyze tableA; ANALYZE Time: 4237.151 ms itvtrackdata3=> analyze tableB; ANALYZE Time: 46672.939 ms itvtrackdata3=> explain analyze SELECT * FROM tableB NATURAL JOIN tableA WHERE bim>=72555896091359 AND bim<72555935412959 AND bim=bap ORDER BY bim ASC LIMIT 1; QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=149626.57..252987.71 rows=1 width=50) (actual time=5684.013..5684.013 rows=1 loops=1) -> Merge Join (cost=149626.57..252987.71 rows=1 width=50) (actual time=5684.012..5684.012 rows=1 loops=1) Merge Cond: (("outer"."bim" = "inner"."bap") AND ("outer".foo = "inner".foo)) -> Index Scan using idx_tableB_bim on tableB (cost=0.00..97391.22 rows=55672 width=24) (actual time=0.017..0.059 rows=29 loops=1) Index Cond: (("bim" >= 72555896091359::bigint) AND ("bim" < 72555935412959::bigint)) -> Sort (cost=149626.57..151523.94 rows=758948 width=34) (actual time=5099.300..5442.825 rows=560856 loops=1) Sort Key: tableA."bap", tableA.foo -> Seq Scan on tableA (cost=0.00..47351.48 rows=758948 width=34) (actual time=0.021..1645.204 rows=758948 loops=1) Total runtime: 5706.655 ms (9 rows) Time: 5729.984 ms itvtrackdata3=> set enable_mergejoin to false; SET Time: 0.373 ms itvtrackdata3=> explain analyze SELECT * FROM tableB NATURAL JOIN tableA WHERE bim>=72555896091359 AND bim<72555935412959 AND bim=bap ORDER BY bim ASC LIMIT 1; QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=0.00..432619.68 rows=1 width=50) (actual time=11.149..11.150 rows=1 loops=1) -> Nested Loop (cost=0.00..432619.68 rows=1 width=50) (actual time=11.148..11.148 rows=1 loops=1) Join Filter: ("outer"."bim" = "inner"."bap") -> Index Scan using idx_tableB_bim on tableB (cost=0.00..97391.22 rows=55672 width=24) (actual time=0.017..0.062 rows=29 loops=1) Index Cond: (("bim" >= 72555896091359::bigint) AND ("bim" < 72555935412959::bigint)) -> Index Scan using idx_tableA_foo on tableA (cost=0.00..6.01 rows=1 width=34) (actual time=0.007..0.379 rows=1 loops=29) Index Cond: ("outer".foo = tableA.foo) Total runtime: 11.215 ms (8 rows) Time: 32.007 ms Have I just flubbed the patch, or is there something else going on here? Thanks, --Ian On Fri, 2005-07-22 at 12:20, Tom Lane wrote: > I wrote: > > Dawid Kuroczko <[EMAIL PROTECTED]> writes: > >> qnex=# EXPLAIN SELECT * FROM log NATURAL JOIN useragents LIMIT 1; > > >> Limit (cost=15912.20..15912.31 rows=1 width=272) > >> -> Hash Join (cost=15912.20..5328368.96 rows=47044336 width=272) > > >> If I set enable_hashjoin=false: > > >> qnex=# EXPLAIN ANALYZE SELECT * FROM log NATURAL LEFT JOIN useragents > >> LIMIT 1; > > >> Limit (cost=0.00..3.07 rows=1 width=272) (actual time=74.214..74.216 > >> rows=1 loops=1) > >> -> Nested Loop Left Join (cost=0.00..144295895.01 rows=47044336 > >> width=272) (actual time=74.204..74.204 rows=1 loops=1) > > > This is quite strange. The nestloop plan definitely should be preferred > > in the context of the LIMIT, considering that it has far lower estimated > > cost. And it is preferred in simple tests for me. > > After a suitable period of contemplating my navel, I figured out > what is going on here: the total costs involved are large enough that > the still-fairly-high startup cost of the hash is disregarded by > compare_fuzzy_path_costs(), and so the nestloop is discarded as not > having any significant potential advantage in startup time. > > I think that this refutes the original scheme of using the same fuzz > factor for both startup and total cost comparisons, and therefore > propose the attached patch. > > Comments? > > regards, tom lane > > *** src/backend/optimizer/util/pathnode.c.orig Fri Jul 15 13:09:25 2005 > --- src/backend/optimizer/util/pathnode.c Fri Jul 22 12:08:25 2005 > *************** > *** 98,157 **** > static int > compare_fuzzy_path_costs(Path *path1, Path *path2, CostSelector criterion) > { > - Cost fuzz; > - > /* > ! * The fuzz factor is set at one percent of the smaller total_cost, > ! * but not less than 0.01 cost units (just in case total cost is > ! * zero). > * > * XXX does this percentage need to be user-configurable? > */ > - fuzz = Min(path1->total_cost, path2->total_cost) * 0.01; > - fuzz = Max(fuzz, 0.01); > - > if (criterion == STARTUP_COST) > { > ! if (Abs(path1->startup_cost - path2->startup_cost) > fuzz) > ! { > ! if (path1->startup_cost < path2->startup_cost) > ! return -1; > ! else > ! return +1; > ! } > > /* > * If paths have the same startup cost (not at all unlikely), > * order them by total cost. > */ > ! if (Abs(path1->total_cost - path2->total_cost) > fuzz) > ! { > ! if (path1->total_cost < path2->total_cost) > ! return -1; > ! else > ! return +1; > ! } > } > else > { > ! if (Abs(path1->total_cost - path2->total_cost) > fuzz) > ! { > ! if (path1->total_cost < path2->total_cost) > ! return -1; > ! else > ! return +1; > ! } > > /* > * If paths have the same total cost, order them by startup > cost. > */ > ! if (Abs(path1->startup_cost - path2->startup_cost) > fuzz) > ! { > ! if (path1->startup_cost < path2->startup_cost) > ! return -1; > ! else > ! return +1; > ! } > } > return 0; > } > --- 98,138 ---- > static int > compare_fuzzy_path_costs(Path *path1, Path *path2, CostSelector criterion) > { > /* > ! * We use a fuzz factor of 1% of the smaller cost. > * > * XXX does this percentage need to be user-configurable? > */ > if (criterion == STARTUP_COST) > { > ! if (path1->startup_cost > path2->startup_cost * 1.01) > ! return +1; > ! if (path2->startup_cost > path1->startup_cost * 1.01) > ! return -1; > > /* > * If paths have the same startup cost (not at all unlikely), > * order them by total cost. > */ > ! if (path1->total_cost > path2->total_cost * 1.01) > ! return +1; > ! if (path2->total_cost > path1->total_cost * 1.01) > ! return -1; > } > else > { > ! if (path1->total_cost > path2->total_cost * 1.01) > ! return +1; > ! if (path2->total_cost > path1->total_cost * 1.01) > ! return -1; > > /* > * If paths have the same total cost, order them by startup > cost. > */ > ! if (path1->startup_cost > path2->startup_cost * 1.01) > ! return +1; > ! if (path2->startup_cost > path1->startup_cost * 1.01) > ! return -1; > } > return 0; > } > > ---------------------------(end of broadcast)--------------------------- > TIP 3: Have you checked our extensive FAQ? > > http://www.postgresql.org/docs/faq ---------------------------(end of broadcast)--------------------------- TIP 5: don't forget to increase your free space map settings