[GENERAL] understanding why two nearly identical queries take two different planner routes, one 5s and one 2hr

john gale Tue, 05 Aug 2014 12:37:05 -0700

I would be appreciative if somebody could help explain why we have two nearly 
identical queries taking two different planner routes;  one a nested index loop 
that takes about 5s to complete, and the other a hash join & heap scan that 
takes about 2hr.  This is using Postgres 9.3.3 on OS X 10.9.4.


These two queries are different only in the value of an hstore entry and a date 
entry:

SELECT *, testruns.id FROM testruns JOIN test_types ON 
testruns.test_type_id=test_types.id WHERE testruns.custom_data->'SpawnID' = 
'SpawnID-428842195.338828' AND testruns.started_at > '2014-08-03 
10:49:55.338828';
SELECT *, testruns.id FROM testruns JOIN test_types ON 
testruns.test_type_id=test_types.id WHERE testruns.custom_data->'SpawnID' = 
'SpawnID-428870395.258592' AND testruns.started_at > '2014-08-03 
18:39:55.258592';

We have built an index for the hstore entry for SpawnID and most queries use it 
correctly.

      Column      |            Type             |                       
Modifiers                       
------------------+-----------------------------+-------------------------------------------------------
 id               | integer                     | not null default 
nextval('testruns_id_seq'::regclass)
 started_at       | timestamp without time zone | not null
...
Indexes:
...
    "index_testruns_on_custom_spawnid" btree ((custom_data -> 'SpawnID'::text))

However after realizing that these two queries differed in execution time so 
drastically, an explain showed that they are taking two very different paths:

db=# explain SELECT *, testruns.id FROM testruns JOIN test_types ON 
testruns.test_type_id=test_types.id WHERE testruns.custom_data->'SpawnID' = 
'SpawnID-428842195.338828' AND testruns.started_at > '2014-08-03 
10:49:55.338828';
                                                  QUERY PLAN                    
                              
--------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.85..2696.12 rows=88 width=1466)
   ->  Index Scan using index_testruns_on_custom_spawnid on testruns  
(cost=0.57..2378.40 rows=88 width=1399)
         Index Cond: ((custom_data -> 'SpawnID'::text) = 
'SpawnID-428842195.338828'::text)
         Filter: (started_at > '2014-08-03 10:49:55.338828'::timestamp without 
time zone)
   ->  Index Scan using test_types_pkey on test_types  (cost=0.29..3.60 rows=1 
width=67)
         Index Cond: (id = testruns.test_type_id)
(6 rows)

Time: 22.363 ms
db=# explain SELECT *, testruns.id FROM testruns JOIN test_types ON 
testruns.test_type_id=test_types.id WHERE testruns.custom_data->'SpawnID' = 
'SpawnID-428870395.258592' AND testruns.started_at > '2014-08-03 
18:39:55.258592';
                                                                                
 QUERY PLAN                                                                     
            
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=292249.24..348608.93 rows=28273 width=1466)
   Hash Cond: (testruns.test_type_id = test_types.id)
   ->  Bitmap Heap Scan on testruns  (cost=291510.31..347269.21 rows=28273 
width=1399)
         Recheck Cond: (((custom_data -> 'SpawnID'::text) = 
'SpawnID-428870395.258592'::text) AND (started_at > '2014-08-03 
18:39:55.258592'::timestamp without time zone))
         ->  BitmapAnd  (cost=291510.31..291510.31 rows=28273 width=0)
               ->  Bitmap Index Scan on index_testruns_on_custom_spawnid  
(cost=0.00..41383.84 rows=500170 width=0)
                     Index Cond: ((custom_data -> 'SpawnID'::text) = 
'SpawnID-428870395.258592'::text)
               ->  Bitmap Index Scan on index_testruns_on_started_at_2  
(cost=0.00..250112.08 rows=3188736 width=0)
                     Index Cond: (started_at > '2014-08-03 
18:39:55.258592'::timestamp without time zone)
   ->  Hash  (cost=563.41..563.41 rows=14041 width=67)
         ->  Seq Scan on test_types  (cost=0.00..563.41 rows=14041 width=67)
(11 rows)

Time: 1.231 ms


Mainly, I don't understand why two queries constructed the same but with subtle 
value differences would take two different paths through the analyzer.  We 
haven't fiddled with the default planner options, aside from random_page_cost 
and effective_cache_size to match our hardware.  Re-analyzing the table does 
not seem to change the cost.  The index size is somewhat large (16G for a 114G 
table), and I haven't tried rebuilding it yet.

thanks,

        ~ john


-- 
Sent via pgsql-general mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

[GENERAL] understanding why two nearly identical queries take two different planner routes, one 5s and one 2hr

Reply via email to