On May 22, 2019, at 11:42 AM, Tom Lane <t...@sss.pgh.pa.us> wrote: > > Donald Dong <xd...@csumb.edu> writes: >> I find the cost from cheapest_total_path->total_cost is different >> from the cost from queryDesc->planstate->total_cost. What I saw was >> that GEQO tends to form paths with lower >> cheapest_total_path->total_cost (aka the fitness of the children). >> However, standard_join_search is more likely to produce a lower >> queryDesc->planstate->total_cost, which is the cost we get using >> explain. > >> I wonder why those two total costs are different? If the total_cost >> from the planstate is more accurate, could we use that instead as the >> fitness in geqo_eval? > > You're still asking us to answer hypothetical questions unsupported > by evidence. In what case does that really happen?
Hi, My apologies if this is not the minimal necessary set up. But here's more information about what I saw using the following query (JOB/1a.sql): SELECT MIN(mc.note) AS production_note, MIN(t.title) AS movie_title, MIN(t.production_year) AS movie_year FROM company_type AS ct, info_type AS it, movie_companies AS mc, movie_info_idx AS mi_idx, title AS t WHERE ct.kind = 'production companies' AND it.info = 'top 250 rank' AND mc.note NOT LIKE '%(as Metro-Goldwyn-Mayer Pictures)%' AND (mc.note LIKE '%(co-production)%' OR mc.note LIKE '%(presents)%') AND ct.id = mc.company_type_id AND t.id = mc.movie_id AND t.id = mi_idx.movie_id AND mc.movie_id = mi_idx.movie_id AND it.id = mi_idx.info_type_id; I attached the query plan and debug_print_rel output for GEQO and standard_join_search. planstate->total_cost cheapest_total_path GEQO 54190.13 54239.03 STD 54179.02 54273.73 Here I observe GEQO produces a lower cheapest_total_path->total_cost, but its planstate->total_cost is higher than what standard_join_search produces. Regards, Donald Dong
Finalize Aggregate (cost=54190.12..54190.13 rows=1 width=68) -> Gather (cost=54189.90..54190.11 rows=2 width=68) Workers Planned: 2 -> Partial Aggregate (cost=53189.90..53189.91 rows=1 width=68) -> Nested Loop (cost=15318.71..53189.55 rows=46 width=45) Join Filter: (mc.movie_id = t.id) -> Hash Join (cost=15318.28..53162.39 rows=46 width=32) Hash Cond: (mc.company_type_id = ct.id) -> Parallel Hash Join (cost=15317.21..53160.33 rows=185 width=36) Hash Cond: (mc.movie_id = mi_idx.movie_id) -> Parallel Seq Scan on movie_companies mc (cost=0.00..37814.90 rows=7320 width=32) Filter: (((note)::text !~~ '%(as Metro-Goldwyn-Mayer Pictures)%'::text) AND (((note)::text ~~ '%(co-production)%'::text) OR ((note)::text ~~ ' %(presents)%'::text))) -> Parallel Hash (cost=15253.60..15253.60 rows=5089 width=4) -> Hash Join (cost=2.43..15253.60 rows=5089 width=4) Hash Cond: (mi_idx.info_type_id = it.id) -> Parallel Seq Scan on movie_info_idx mi_idx (cost=0.00..13685.15 rows=575015 width=8) -> Hash (cost=2.41..2.41 rows=1 width=4) -> Seq Scan on info_type it (cost=0.00..2.41 rows=1 width=4) Filter: ((info)::text = 'top 250 rank'::text) -> Hash (cost=1.05..1.05 rows=1 width=4) -> Seq Scan on company_type ct (cost=0.00..1.05 rows=1 width=4) Filter: ((kind)::text = 'production companies'::text) -> Index Scan using title_pkey on title t (cost=0.43..0.58 rows=1 width=25) Index Cond: (id = mi_idx.movie_id)
Finalize Aggregate (cost=54179.01..54179.02 rows=1 width=68) -> Gather (cost=54178.79..54179.00 rows=2 width=68) Workers Planned: 2 -> Partial Aggregate (cost=53178.79..53178.80 rows=1 width=68) -> Nested Loop (cost=37881.27..53178.44 rows=46 width=45) Join Filter: (mc.movie_id = t.id) -> Parallel Hash Join (cost=37880.84..53151.28 rows=46 width=32) Hash Cond: (mi_idx.movie_id = mc.movie_id) -> Hash Join (cost=2.43..15253.60 rows=5089 width=4) Hash Cond: (mi_idx.info_type_id = it.id) -> Parallel Seq Scan on movie_info_idx mi_idx (cost=0.00..13685.15 rows=575015 width=8) -> Hash (cost=2.41..2.41 rows=1 width=4) -> Seq Scan on info_type it (cost=0.00..2.41 rows=1 width=4) Filter: ((info)::text = 'top 250 rank'::text) -> Parallel Hash (cost=37855.54..37855.54 rows=1830 width=28) -> Hash Join (cost=1.06..37855.54 rows=1830 width=28) Hash Cond: (mc.company_type_id = ct.id) -> Parallel Seq Scan on movie_companies mc (cost=0.00..37814.90 rows=7320 width=32) Filter: (((note)::text !~~ '%(as Metro-Goldwyn-Mayer Pictures)%'::text) AND (((note)::text ~~ '%(co-production)%'::text) OR ((note)::tex t ~~ '%(presents)%'::text))) -> Hash (cost=1.05..1.05 rows=1 width=4) -> Seq Scan on company_type ct (cost=0.00..1.05 rows=1 width=4) Filter: ((kind)::text = 'production companies'::text) -> Index Scan using title_pkey on title t (cost=0.43..0.58 rows=1 width=25) Index Cond: (id = mi_idx.movie_id)
cheapest total path: NestLoop(ct it mc mi_idx t) rows=111 cost=16318.71..54273.73 clauses: mc.movie_id = t.id HashJoin(ct it mc mi_idx) rows=111 cost=16318.28..54208.19 clauses: mc.company_type_id = ct.id Gather(it mc mi_idx) rows=444 cost=16317.21..54204.73 HashJoin(it mc mi_idx) rows=185 cost=15317.21..53160.33 clauses: mi_idx.movie_id = mc.movie_id SeqScan(mc) rows=7320 cost=0.00..37814.90 HashJoin(it mi_idx) rows=5089 cost=2.43..15253.60 clauses: it.id = mi_idx.info_type_id SeqScan(mi_idx) rows=575015 cost=0.00..13685.15 SeqScan(it) rows=1 cost=0.00..2.41 SeqScan(ct) rows=1 cost=0.00..1.05 IdxScan(t) required_outer (mi_idx) rows=1 cost=0.43..0.58 pathkeys: ((t.id, mc.movie_id, mi_idx.movie_id))
cheapest total path: NestLoop(ct it mc mi_idx t) rows=111 cost=16318.71..54239.03 clauses: mc.movie_id = t.id Gather(ct it mc mi_idx) rows=111 cost=16318.28..54173.49 HashJoin(ct it mc mi_idx) rows=46 cost=15318.28..53162.39 clauses: mc.company_type_id = ct.id HashJoin(it mc mi_idx) rows=185 cost=15317.21..53160.33 clauses: mc.movie_id = mi_idx.movie_id SeqScan(mc) rows=7320 cost=0.00..37814.90 HashJoin(it mi_idx) rows=5089 cost=2.43..15253.60 clauses: it.id = mi_idx.info_type_id SeqScan(mi_idx) rows=575015 cost=0.00..13685.15 SeqScan(it) rows=1 cost=0.00..2.41 SeqScan(ct) rows=1 cost=0.00..1.05 IdxScan(t) required_outer (mi_idx) rows=1 cost=0.43..0.58 pathkeys: ((t.id, mc.movie_id, mi_idx.movie_id))