Hi,

Below are two query plan for same SQL with and without an index. I noticed the 
Hash join order has changed since index has been created and this is not what I 
want. As it's hashing the big table and to provoke records in  a small table. 
in Oracle, it's simple to add hint to point the table you'd like to be used as 
the provoke table. However, in Postgres, I don't know how to change the 
behavior.

--plan 1, 10 seconds were spent on sequential scan on term_weekly table.

dev=# explain analyze select distinct  cs_id from lookup_weekly  n inner join 
term_weekly s on s.b_id=n.b_id and s.date=n.date where term in ('cat'::text);


                                                                           
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
HashAggregate  (cost=2100211.06..2100211.11 rows=5 width=4) (actual 
time=27095.470..27095.487 rows=138 loops=1)
   ->  Hash Join  (cost=954343.95..2100211.04 rows=5 width=4) (actual 
time=24088.912..27095.206 rows=160 loops=1)
         Hash Cond: (((n.b_id)::text = (s.b_id)::text) AND (n.date = s.date))
         ->  Append  (cost=0.00..862153.59 rows=37828460 width=52) (actual 
time=0.016..10923.091 rows=37828459 loops=1)
               ->  Seq Scan on lookup_weekly n  (cost=0.00..0.00 rows=1 
width=524) (actual time=0.001..0.001 rows=0 loops=1)
               ->  Seq Scan on lookup_weekly_20131130 n_1  
(cost=0.00..117764.18 rows=5158718 width=52) (actual time=0.015..1229.217 
rows=5158718 loops=1)
               ->  Seq Scan on lookup_weekly_20131207 n_2  
(cost=0.00..117764.18 rows=5158718 width=52) (actual time=5.225..1177.539 
rows=5158718 loops=1)
               ->  Seq Scan on lookup_weekly_20131214 n_3  
(cost=0.00..117764.18 rows=5158718 width=52) (actual time=5.756..1274.135 
rows=5158718 loops=1)
               ->  Seq Scan on lookup_weekly_20131221 n_4  
(cost=0.00..117764.18 rows=5158718 width=52) (actual time=4.269..1131.570 
rows=5158718 loops=1)
               ->  Seq Scan on lookup_weekly_20131228 n_5  
(cost=0.00..117764.18 rows=5158718 width=52) (actual time=9.383..1110.435 
rows=5158718 loops=1)
               ->  Seq Scan on lookup_weekly_20140426 n_6  (cost=0.00..91715.42 
rows=4042442 width=52) (actual time=8.137..947.724 rows=4042442 loops=1)
               ->  Seq Scan on lookup_weekly_20140503 n_7  (cost=0.00..93516.49 
rows=4118149 width=52) (actual time=7.717..791.339 rows=4118149 loops=1)
               ->  Seq Scan on lookup_weekly_20140329 n_8  (cost=0.00..88100.78 
rows=3874278 width=52) (actual time=0.004..637.297 rows=3874278 loops=1)
         ->  Hash  (cost=954343.47..954343.47 rows=32 width=61) (actual 
time=10604.327..10604.327 rows=553 loops=1)
               Buckets: 1024  Batches: 1  Memory Usage: 43kB
               ->  Append  (cost=0.00..954343.47 rows=32 width=61) (actual 
time=10.009..10602.075 rows=553 loops=1)
                     ->  Seq Scan on term_weekly s  (cost=0.00..0.00 rows=1 
width=520) (actual time=0.000..0.000 rows=0 loops=1)
                           Filter: (term = 'cat'::text)
                     ->  Seq Scan on term_weekly_20140503 s_1  
(cost=0.00..262030.12 rows=8 width=46) (actual time=10.007..3738.945 rows=166 
loops=1)
                           Filter: (term = 'cat'::text)
                           Rows Removed by Filter: 8516324
                     ->  Seq Scan on term_weekly_20140510 s_2  
(cost=0.00..246131.35 rows=8 width=46) (actual time=52.059..2316.793 rows=152 
loops=1)
                           Filter: (term = 'cat'::text)
                           Rows Removed by Filter: 8010196
                     ->  Seq Scan on term_weekly_20140517 s_3  
(cost=0.00..233644.94 rows=8 width=46) (actual time=26.661..2504.273 rows=135 
loops=1)
                           Filter: (term = 'cat'::text)
                           Rows Removed by Filter: 7632420
                     ->  Seq Scan on term_weekly_20140524 s_4  
(cost=0.00..212537.06 rows=7 width=46) (actual time=49.773..2041.578 rows=100 
loops=1)
                           Filter: (term = 'cat'::text)
                           Rows Removed by Filter: 6950865
Total runtime: 27095.639 ms
(31 rows)

--plan 2, only 1 second spent on index scan of term_weekly table, however, as 
it selects the big table to do the hashing, it takes 22 seconds for the hash to 
complete. The advantage get from index has been totally lost because of this 
join order.

                                                                                
        QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
HashAggregate  (cost=1429795.17..1429795.22 rows=5 width=4) (actual 
time=22991.289..22991.307 rows=138 loops=1)
   ->  Hash Join  (cost=1429580.49..1429795.15 rows=5 width=4) (actual 
time=22963.340..22991.214 rows=160 loops=1)
         Hash Cond: (((s.b_id)::text = (n.b_id)::text) AND (s.date = n.date))
         ->  Append  (cost=0.00..142.77 rows=32 width=61) (actual 
time=0.052..1.125 rows=553 loops=1)
               ->  Seq Scan on term_weekly s  (cost=0.00..0.00 rows=1 
width=520) (actual time=0.001..0.001 rows=0 loops=1)
                     Filter: (term = 'cat'::text)
               ->  Index Scan using idx_term_weekly_20140503_3 on 
term_weekly_20140503 s_1  (cost=0.56..36.70 rows=8 width=46) (actual 
time=0.051..0.353 rows=166 loops=1)
                     Index Cond: (term = 'cat'::text)
               ->  Index Scan using idx_term_weekly_20140510_3 on 
term_weekly_20140510 s_2  (cost=0.56..36.70 rows=8 width=46) (actual 
time=0.043..0.293 rows=152 loops=1)
                     Index Cond: (term = 'cat'::text)
               ->  Index Scan using idx_term_weekly_20140517_3 on 
term_weekly_20140517 s_3  (cost=0.56..36.70 rows=8 width=46) (actual 
time=0.029..0.244 rows=135 loops=1)
                     Index Cond: (term = 'cat'::text)
               ->  Index Scan using idx_term_weekly_20140524_3 on 
term_weekly_20140524 s_4  (cost=0.56..32.68 rows=7 width=46) (actual 
time=0.024..0.192 rows=100 loops=1)
                     Index Cond: (term = 'cat'::text)
         ->  Hash  (cost=862153.59..862153.59 rows=37828460 width=52) (actual 
time=22939.457..22939.457 rows=37828459 loops=1)
               Buckets: 4194304  Batches: 1  Memory Usage: 3144960kB
               ->  Append  (cost=0.00..862153.59 rows=37828460 width=52) 
(actual time=0.010..9100.690 rows=37828459 loops=1)
                     ->  Seq Scan on lookup_weekly n  (cost=0.00..0.00 rows=1 
width=524) (actual time=0.001..0.001 rows=0 loops=1)
                     ->  Seq Scan on lookup_weekly_20131130 n_1  
(cost=0.00..117764.18 rows=5158718 width=52) (actual time=0.008..1099.194 
rows=5158718 loops=1)
                     ->  Seq Scan on lookup_weekly_20131207 n_2  
(cost=0.00..117764.18 rows=5158718 width=52) (actual time=0.004..861.678 
rows=5158718 loops=1)
                     ->  Seq Scan on lookup_weekly_20131214 n_3  
(cost=0.00..117764.18 rows=5158718 width=52) (actual time=0.004..860.374 
rows=5158718 loops=1)
                     ->  Seq Scan on lookup_weekly_20131221 n_4  
(cost=0.00..117764.18 rows=5158718 width=52) (actual time=0.003..852.169 
rows=5158718 loops=1)
                     ->  Seq Scan on lookup_weekly_20131228 n_5  
(cost=0.00..117764.18 rows=5158718 width=52) (actual time=0.005..835.201 
rows=5158718 loops=1)
                     ->  Seq Scan on lookup_weekly_20140426 n_6  
(cost=0.00..91715.42 rows=4042442 width=52) (actual time=0.005..663.261 
rows=4042442 loops=1)
                     ->  Seq Scan on lookup_weekly_20140503 n_7  
(cost=0.00..93516.49 rows=4118149 width=52) (actual time=0.006..678.281 
rows=4118149 loops=1)
                     ->  Seq Scan on lookup_weekly_20140329 n_8  
(cost=0.00..88100.78 rows=3874278 width=52) (actual time=0.003..635.296 
rows=3874278 loops=1)
Total runtime: 22995.361 ms
(27 rows)

Thanks,
Suya

Reply via email to