On May 22, 2019, at 11:42 AM, Tom Lane <[email protected]> wrote:
>
> Donald Dong <[email protected]> writes:
>> I find the cost from cheapest_total_path->total_cost is different
>> from the cost from queryDesc->planstate->total_cost. What I saw was
>> that GEQO tends to form paths with lower
>> cheapest_total_path->total_cost (aka the fitness of the children).
>> However, standard_join_search is more likely to produce a lower
>> queryDesc->planstate->total_cost, which is the cost we get using
>> explain.
>
>> I wonder why those two total costs are different? If the total_cost
>> from the planstate is more accurate, could we use that instead as the
>> fitness in geqo_eval?
>
> You're still asking us to answer hypothetical questions unsupported
> by evidence. In what case does that really happen?
Hi,
My apologies if this is not the minimal necessary set up. But here's
more information about what I saw using the following query
(JOB/1a.sql):
SELECT MIN(mc.note) AS production_note,
MIN(t.title) AS movie_title,
MIN(t.production_year) AS movie_year
FROM company_type AS ct,
info_type AS it,
movie_companies AS mc,
movie_info_idx AS mi_idx,
title AS t
WHERE ct.kind = 'production companies'
AND it.info = 'top 250 rank'
AND mc.note NOT LIKE '%(as Metro-Goldwyn-Mayer Pictures)%'
AND (mc.note LIKE '%(co-production)%'
OR mc.note LIKE '%(presents)%')
AND ct.id = mc.company_type_id
AND t.id = mc.movie_id
AND t.id = mi_idx.movie_id
AND mc.movie_id = mi_idx.movie_id
AND it.id = mi_idx.info_type_id;
I attached the query plan and debug_print_rel output for GEQO and
standard_join_search.
planstate->total_cost cheapest_total_path
GEQO 54190.13 54239.03
STD 54179.02 54273.73
Here I observe GEQO produces a lower
cheapest_total_path->total_cost, but its planstate->total_cost is higher
than what standard_join_search produces.
Regards,
Donald Dong
Finalize Aggregate (cost=54190.12..54190.13 rows=1 width=68)
-> Gather (cost=54189.90..54190.11 rows=2 width=68)
Workers Planned: 2
-> Partial Aggregate (cost=53189.90..53189.91 rows=1 width=68)
-> Nested Loop (cost=15318.71..53189.55 rows=46 width=45)
Join Filter: (mc.movie_id = t.id)
-> Hash Join (cost=15318.28..53162.39 rows=46 width=32)
Hash Cond: (mc.company_type_id = ct.id)
-> Parallel Hash Join (cost=15317.21..53160.33
rows=185 width=36)
Hash Cond: (mc.movie_id = mi_idx.movie_id)
-> Parallel Seq Scan on movie_companies mc
(cost=0.00..37814.90 rows=7320 width=32)
Filter: (((note)::text !~~ '%(as
Metro-Goldwyn-Mayer Pictures)%'::text) AND (((note)::text ~~
'%(co-production)%'::text) OR ((note)::text ~~ ' %(presents)%'::text)))
-> Parallel Hash (cost=15253.60..15253.60
rows=5089 width=4)
-> Hash Join (cost=2.43..15253.60
rows=5089 width=4)
Hash Cond: (mi_idx.info_type_id =
it.id)
-> Parallel Seq Scan on
movie_info_idx mi_idx (cost=0.00..13685.15 rows=575015 width=8)
-> Hash (cost=2.41..2.41 rows=1
width=4)
-> Seq Scan on info_type it
(cost=0.00..2.41 rows=1 width=4)
Filter: ((info)::text
= 'top 250 rank'::text)
-> Hash (cost=1.05..1.05 rows=1 width=4)
-> Seq Scan on company_type ct
(cost=0.00..1.05 rows=1 width=4)
Filter: ((kind)::text = 'production
companies'::text)
-> Index Scan using title_pkey on title t
(cost=0.43..0.58 rows=1 width=25)
Index Cond: (id = mi_idx.movie_id)
Finalize Aggregate (cost=54179.01..54179.02 rows=1 width=68)
-> Gather (cost=54178.79..54179.00 rows=2 width=68)
Workers Planned: 2
-> Partial Aggregate (cost=53178.79..53178.80 rows=1 width=68)
-> Nested Loop (cost=37881.27..53178.44 rows=46 width=45)
Join Filter: (mc.movie_id = t.id)
-> Parallel Hash Join (cost=37880.84..53151.28 rows=46
width=32)
Hash Cond: (mi_idx.movie_id = mc.movie_id)
-> Hash Join (cost=2.43..15253.60 rows=5089
width=4)
Hash Cond: (mi_idx.info_type_id = it.id)
-> Parallel Seq Scan on movie_info_idx mi_idx
(cost=0.00..13685.15 rows=575015 width=8)
-> Hash (cost=2.41..2.41 rows=1 width=4)
-> Seq Scan on info_type it
(cost=0.00..2.41 rows=1 width=4)
Filter: ((info)::text = 'top 250
rank'::text)
-> Parallel Hash (cost=37855.54..37855.54
rows=1830 width=28)
-> Hash Join (cost=1.06..37855.54 rows=1830
width=28)
Hash Cond: (mc.company_type_id = ct.id)
-> Parallel Seq Scan on movie_companies
mc (cost=0.00..37814.90 rows=7320 width=32)
Filter: (((note)::text !~~ '%(as
Metro-Goldwyn-Mayer Pictures)%'::text) AND (((note)::text ~~
'%(co-production)%'::text) OR ((note)::tex
t ~~ '%(presents)%'::text)))
-> Hash (cost=1.05..1.05 rows=1
width=4)
-> Seq Scan on company_type ct
(cost=0.00..1.05 rows=1 width=4)
Filter: ((kind)::text =
'production companies'::text)
-> Index Scan using title_pkey on title t
(cost=0.43..0.58 rows=1 width=25)
Index Cond: (id = mi_idx.movie_id)
cheapest total path:
NestLoop(ct it mc mi_idx t) rows=111 cost=16318.71..54273.73
clauses: mc.movie_id = t.id
HashJoin(ct it mc mi_idx) rows=111 cost=16318.28..54208.19
clauses: mc.company_type_id = ct.id
Gather(it mc mi_idx) rows=444 cost=16317.21..54204.73
HashJoin(it mc mi_idx) rows=185 cost=15317.21..53160.33
clauses: mi_idx.movie_id = mc.movie_id
SeqScan(mc) rows=7320 cost=0.00..37814.90
HashJoin(it mi_idx) rows=5089
cost=2.43..15253.60
clauses: it.id = mi_idx.info_type_id
SeqScan(mi_idx) rows=575015
cost=0.00..13685.15
SeqScan(it) rows=1 cost=0.00..2.41
SeqScan(ct) rows=1 cost=0.00..1.05
IdxScan(t) required_outer (mi_idx) rows=1 cost=0.43..0.58
pathkeys: ((t.id, mc.movie_id, mi_idx.movie_id))
cheapest total path:
NestLoop(ct it mc mi_idx t) rows=111 cost=16318.71..54239.03
clauses: mc.movie_id = t.id
Gather(ct it mc mi_idx) rows=111 cost=16318.28..54173.49
HashJoin(ct it mc mi_idx) rows=46 cost=15318.28..53162.39
clauses: mc.company_type_id = ct.id
HashJoin(it mc mi_idx) rows=185 cost=15317.21..53160.33
clauses: mc.movie_id = mi_idx.movie_id
SeqScan(mc) rows=7320 cost=0.00..37814.90
HashJoin(it mi_idx) rows=5089
cost=2.43..15253.60
clauses: it.id = mi_idx.info_type_id
SeqScan(mi_idx) rows=575015
cost=0.00..13685.15
SeqScan(it) rows=1 cost=0.00..2.41
SeqScan(ct) rows=1 cost=0.00..1.05
IdxScan(t) required_outer (mi_idx) rows=1 cost=0.43..0.58
pathkeys: ((t.id, mc.movie_id, mi_idx.movie_id))