On Aug 5, 2014, at 12:45 PM, Shaun Thomas <[email protected]> wrote:
> Your EXPLAIN output basically answered this for you. Your fast query has this:
>
>> Nested Loop (cost=0.85..2696.12 rows=88 width=1466)
>
> While your slow one has this:
>
>> Hash Join (cost=292249.24..348608.93 rows=28273 width=1466)
>
> If this data is at all accurate (run an EXPLAIN ANALYZE to get the real
> numbers), the second query will return about 300x more rows than the first
> one. If your first query takes 5s, 20 minutes for the second isn't beyond the
> realm of possibility. The planner changed approaches because an efficient
> technique to join 88 rows is much different than what's needed for 300 times
> as many.
>
> Given that, five seconds for 88 rows is astonishingly slow, even for hstore.
> I'd be willing to bet something else is going on here.
Even on a 114G table with a 16G index, you would consider this slow? (physical
disk space is closer to 800G, that was our high-water before removing lots of
rows and vacuuming, although it is running on SSD)
Re-running the query always makes it much faster, but these are generally just
run once, and the duration log says the fast queries usually take between 2s -
5s the first time. (I'm assuming that's because we have a very large
effective_cache_size...?)
The explain analyze finished and suggests the initial row count was correct:
db=# explain analyze SELECT *, testruns.id FROM testruns JOIN test_types ON
testruns.test_type_id=test_types.id WHERE testruns.custom_data->'SpawnID' =
'SpawnID-428870395.258592' AND testruns.started_at > '2014-08-03
18:39:55.258592';
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Hash Join (cost=292303.24..348662.93 rows=28273 width=1466) (actual
time=23875.727..824373.654 rows=335032 loops=1)
Hash Cond: (testruns.test_type_id = test_types.id)
-> Bitmap Heap Scan on testruns (cost=291564.31..347323.21 rows=28273
width=1399) (actual time=23868.077..823271.058 rows=335032 loops=1)
Recheck Cond: (((custom_data -> 'SpawnID'::text) =
'SpawnID-428870395.258592'::text) AND (started_at > '2014-08-03
18:39:55.258592'::timestamp without time zone))
Rows Removed by Index Recheck: 1867456
-> BitmapAnd (cost=291564.31..291564.31 rows=28273 width=0) (actual
time=23843.870..23843.870 rows=0 loops=1)
-> Bitmap Index Scan on index_testruns_on_custom_spawnid
(cost=0.00..41437.84 rows=500170 width=0) (actual time=4872.404..4872.404
rows=2438520 loops=1)
Index Cond: ((custom_data -> 'SpawnID'::text) =
'SpawnID-428870395.258592'::text)
-> Bitmap Index Scan on index_testruns_on_started_at_2
(cost=0.00..250112.08 rows=3188736 width=0) (actual time=18679.875..18679.875
rows=5822899 loops=1)
Index Cond: (started_at > '2014-08-03
18:39:55.258592'::timestamp without time zone)
-> Hash (cost=563.41..563.41 rows=14041 width=67) (actual
time=7.635..7.635 rows=14215 loops=1)
Buckets: 2048 Batches: 1 Memory Usage: 1380kB
-> Seq Scan on test_types (cost=0.00..563.41 rows=14041 width=67)
(actual time=0.004..2.639 rows=14215 loops=1)
Total runtime: 824471.902 ms
(14 rows)
Time: 824473.429 ms
~ john