Hey Everyone, 

I seem to be getting an inaccurate cost from explain.  Here are two examples 
for one query with two different query plans:

exchange_prod=# set enable_nestloop = on;
SET
exchange_prod=#
exchange_prod=# explain analyze SELECT COUNT(DISTINCT "exchange_uploads"."id") 
FROM "exchange_uploads" INNER JOIN "upload_destinations" ON 
"upload_destinations"."id" = "exchange_uploads"."upload_destination_id" LEFT 
OUTER JOIN "uploads" ON "uploads"."id" = "exchange_uploads"."upload_id" LEFT 
OUTER JOIN "import_errors" ON "import_errors"."exchange_upload_id" = 
"exchange_uploads"."id" LEFT OUTER JOIN "exchanges" ON "exchanges"."id" = 
"upload_destinations"."exchange_id" WHERE (("exchange_uploads"."created_at" >= 
'2012-07-27 21:21:57.363944' AND "upload_destinations"."office_id" = 6));
                                                                                
            QUERY PLAN                                                          
                                   
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=190169.54..190169.55 rows=1 width=4) (actual 
time=199806.806..199806.807 rows=1 loops=1)
   ->  Nested Loop Left Join  (cost=0.00..190162.49 rows=2817 width=4) (actual 
time=163.293..199753.548 rows=43904 loops=1)
         ->  Nested Loop  (cost=0.00..151986.53 rows=2817 width=4) (actual 
time=163.275..186869.844 rows=43904 loops=1)
               ->  Index Scan using upload_destinations_office_id_idx on 
upload_destinations  (cost=0.00..29.95 rows=4 width=8) (actual 
time=0.060..0.093 rows=6 loops=1)
                     Index Cond: (office_id = 6)
               ->  Index Scan using 
index_exchange_uploads_on_upload_destination_id on exchange_uploads  
(cost=0.00..37978.21 rows=875 width=12) (actual time=27.197..31140.375 
rows=7317 loops=6)
                     Index Cond: (upload_destination_id = 
upload_destinations.id)
                     Filter: (created_at >= '2012-07-27 
21:21:57.363944'::timestamp without time zone)
         ->  Index Scan using index_import_errors_on_exchange_upload_id on 
import_errors  (cost=0.00..8.49 rows=405 width=4) (actual time=0.291..0.291 
rows=0 loops=43904)
               Index Cond: (exchange_upload_id = exchange_uploads.id)
 Total runtime: 199806.951 ms
(11 rows)

exchange_prod=# 
exchange_prod=# set enable_nestloop = off;
SET
exchange_prod=# 
exchange_prod=# explain analyze SELECT COUNT(DISTINCT "exchange_uploads"."id") 
FROM "exchange_uploads" INNER JOIN "upload_destinations" ON 
"upload_destinations"."id" = "exchange_uploads"."upload_destination_id" LEFT 
OUTER JOIN "uploads" ON "uploads"."id" = "exchange_uploads"."upload_id" LEFT 
OUTER JOIN "import_errors" ON "import_errors"."exchange_upload_id" = 
"exchange_uploads"."id" LEFT OUTER JOIN "exchanges" ON "exchanges"."id" = 
"upload_destinations"."exchange_id" WHERE (("exchange_uploads"."created_at" >= 
'2012-07-27 21:21:57.363944' AND "upload_destinations"."office_id" = 6));
                                                                                
                QUERY PLAN                                                      
                                           
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=2535992.33..2535992.34 rows=1 width=4) (actual 
time=133447.507..133447.507 rows=1 loops=1)
   ->  Hash Right Join  (cost=1816553.69..2535985.56 rows=2708 width=4) (actual 
time=133405.326..133417.078 rows=43906 loops=1)
         Hash Cond: (import_errors.exchange_upload_id = exchange_uploads.id)
         ->  Seq Scan on import_errors  (cost=0.00..710802.71 rows=2300471 
width=4) (actual time=0.006..19199.569 rows=2321888 loops=1)
         ->  Hash  (cost=1816519.84..1816519.84 rows=2708 width=4) (actual 
time=112938.606..112938.606 rows=43906 loops=1)
               Buckets: 1024  Batches: 1  Memory Usage: 1544kB
               ->  Hash Join  (cost=28.25..1816519.84 rows=2708 width=4) 
(actual time=42.957..112892.689 rows=43906 loops=1)
                     Hash Cond: (exchange_uploads.upload_destination_id = 
upload_destinations.id)
                     ->  Index Scan using 
index_upload_destinations_on_created_at on exchange_uploads  
(cost=0.00..1804094.96 rows=3298545 width=12) (actual time=17.686..111649.272 
rows=3303488 loops=1)
                           Index Cond: (created_at >= '2012-07-27 
21:21:57.363944'::timestamp without time zone)
                     ->  Hash  (cost=28.20..28.20 rows=4 width=8) (actual 
time=0.043..0.043 rows=6 loops=1)
                           Buckets: 1024  Batches: 1  Memory Usage: 1kB
                           ->  Bitmap Heap Scan on upload_destinations  
(cost=6.28..28.20 rows=4 width=8) (actual time=0.026..0.036 rows=6 loops=1)
                                 Recheck Cond: (office_id = 6)
                                 ->  Bitmap Index Scan on 
upload_destinations_office_id_idx  (cost=0.00..6.28 rows=4 width=0) (actual 
time=0.020..0.020 rows=6 loops=1)
                                       Index Cond: (office_id = 6)
 Total runtime: 133447.790 ms
(17 rows)



The first query shows a cost of 190,169.55 and runs in 199,806.951 ms.  When I 
disable nested loop, I get a cost of 2,535,992.34 which runs in only 
133,447.790 ms.  We have run queries on our database with a cost of 200K cost 
before and they ran less then a few seconds, which makes me wonder if the first 
query plan is inaccurate.  The other issue is understanding why a query plan 
with a much higher cost is taking less time to run.

I do not think these queries are cached differently, as we have gotten the same 
results ran a couple of times at across a few days.  We also analyzed the 
tables that we are querying before trying the explain analyze again, and were 
met with the same statistics.  Any help on how Postgres comes up with a query 
plan like this, and why there is a difference would be very helpful.

Thanks! 

-- 
Robert Sosinski

Reply via email to