Re: [GENERAL] Regression in query optimizer when upgrading from 9.4.5 to 9.5.2?

2016-06-22 Thread Erdmann, Markus @ Bellevue

On Jun 20, 2016, at 1:36 PM, David G. Johnston 
> wrote:

​Please don't top-post.

Yes, using "[VACUUM] ANALYZE" is what was meant (the VACUUM doesn't hurt - and 
possibly helps - but wasn't required).  ANALYZE recomputes the statistics for 
your database.  The apparent problem was that those statistics were wrong which 
causes the planner to choose the wrong plan.

EXPLAIN ANALYZE doesn't help in that scenario - the explain and planner both 
end up using the bad statistics.  The addition of ANALYZE to EXPLAIN simply 
tells the system to not only explain your query but to execute it as well (but 
discard the results).  Aside from sharing the same 7 characters the two words 
have nothing in common.

I'm not sure how one would "...includ[e] ANALYZE in EXPLAIN ANALYZE" - you only 
get to explain one statement at a time.

David J.


Thank you, David. My confusion originated from a lack of familiarity with the 
ANALYZE command. Your responses and Tom’s response have been very enlightening.

The head-scratcher for us is that our statistics became so out of date even 
though we have the autovacuum daemon enabled in RDS, and according to the docs 
that does run ANALYZE periodically. Now we know (thanks to your help) to check 
for this issue immediately when the planner is showing a large disparity 
between the estimated and actual cost.

Markus E.


Re: [GENERAL] Regression in query optimizer when upgrading from 9.4.5 to 9.5.2?

2016-06-20 Thread Erdmann, Markus @ Bellevue
Thank you Tom and David for your very helpful replies. We dumped and
restored the RDS staging database on a local installation of pg and were
not able to reproduce the issue in 9.5.2, which led us to try running a
VACUUM ANALYZE and recreating indexes. After this we no longer saw a
discrepancy between the query plan in the separate environments. Is this
what you meant, Tom, by making sure to ANALYZE? Or did you mean including
ANALYZE in EXPLAIN ANALYZE?


On 6/17/16, 11:17 AM, "Tom Lane" <t...@sss.pgh.pa.us> wrote:

>"Erdmann, Markus @ Bellevue" <markus.erdm...@cbre.com> writes:
>> We¹re trying to debug a performance issue affecting our staging
>>database, and we¹ve narrowed it down to a difference in the query
>>optimizer in 9.5.2. Upgrading to 9.5 is important for us because we need
>>the ability to import foreign schemas.
>
>I think the core of the problem is the large rowcount estimation error
>here:
>
>>   ->  Bitmap Index Scan on
>>transactions_transaction_c69e55a4  (cost=0.00..18.02 rows=161 width=0)
>>(actual time=20.153..20.153 rows=269021 loops=1)
>> Index Cond: ((date_created >=
>>'2010-01-01'::date) AND (date_created <= '2015-12-31'::date))
>
>That's a pretty simple condition and it's hard to believe that 9.5 does it
>much differently than 9.4 did.  Perhaps you forgot to ANALYZE, or were
>using a larger statistics target in the 9.4 installation?
>
>   regards, tom lane



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Regression in query optimizer when upgrading from 9.4.5 to 9.5.2?

2016-06-17 Thread Erdmann, Markus @ Bellevue
Hello,

We’re trying to debug a performance issue affecting our staging database, and 
we’ve narrowed it down to a difference in the query optimizer in 9.5.2. 
Upgrading to 9.5 is important for us because we need the ability to import 
foreign schemas.

This is the query we’re running:

CREATE TEMP TABLE tmp_joined_transactions_9gkgptn5xcp9 ( transaction_id integer 
PRIMARY KEY );

INSERT INTO tmp_joined_transactions_9gkgptn5xcp9 (transaction_id)
SELECT DISTINCT ON ("transactions_transaction"."id") 
"transactions_transaction"."id"
FROM "transactions_transaction" , "tmp_joined_transactions_75chlsokrsev"
WHERE (NOT ("transactions_transaction"."id"
IN (SELECT U0."id" AS Col1
FROM "transactions_transaction" U0
LEFT OUTER JOIN "transactions_commission" U1
ON ( U0."id" = U1."transaction_id" )
WHERE U1."id" IS NULL))
AND "transactions_transaction"."date_created" >= '2010-01-01'::date
AND "transactions_transaction"."date_created" <= '2015-12-31'::date
AND "transactions_transaction"."deal_status" IN (1)
AND (transactions_transaction.id = 
tmp_joined_transactions_75chlsokrsev.transaction_id))
ORDER BY "transactions_transaction"."id" ASC;

And this is the EXPLAIN ANALYZE output in 9.5.2:

Insert on tmp_joined_transactions_9gkgptn5xcp9  (cost=42111.29..42991.60 rows=1 
width=4) (actual time=470236.029..470236.029 rows=0 loops=1)
  ->  Unique  (cost=42111.29..42991.59 rows=1 width=4) (actual 
time=1109.636..470222.609 rows=4652 loops=1)
->  Nested Loop  (cost=42111.29..42991.59 rows=1 width=4) (actual 
time=1109.635..470219.259 rows=4652 loops=1)
  Join Filter: (transactions_transaction.id = 
tmp_joined_transactions_75chlsokrsev.transaction_id)
  Rows Removed by Join Filter: 1153472704
  ->  Index Only Scan using 
tmp_joined_transactions_75chlsokrsev_pkey on 
tmp_joined_transactions_75chlsokrsev  (cost=0.28..224.61 rows=5355 width=4) 
(actual time=0.016..4.829 rows=4652 loops=1)
Heap Fetches: 4652
  ->  Materialize  (cost=42111.01..42686.66 rows=1 width=4) (actual 
time=0.240..51.639 rows=247953 loops=4652)
->  Bitmap Heap Scan on transactions_transaction  
(cost=42111.01..42686.65 rows=1 width=4) (actual time=1109.580..1238.034 
rows=247953 loops=1)
  Recheck Cond: ((date_created >= '2010-01-01'::date) 
AND (date_created <= '2015-12-31'::date))
  Filter: ((NOT (hashed SubPlan 1)) AND (deal_status = 
1))
  Rows Removed by Filter: 21068
  Heap Blocks: exact=8073
  ->  Bitmap Index Scan on 
transactions_transaction_c69e55a4  (cost=0.00..18.02 rows=161 width=0) (actual 
time=20.153..20.153 rows=269021 loops=1)
Index Cond: ((date_created >= 
'2010-01-01'::date) AND (date_created <= '2015-12-31'::date))
  SubPlan 1
->  Hash Right Join  (cost=8799.57..42092.98 rows=1 
width=4) (actual time=852.012..1086.671 rows=3395 loops=1)
  Hash Cond: (u1.transaction_id = u0.id)
  Filter: (u1.id IS NULL)
  Rows Removed by Filter: 995248
  ->  Seq Scan on transactions_commission u1  
(cost=0.00..27020.63 rows=1039763 width=8) (actual time=0.003..221.750 
rows=1039763 loops=1)
  ->  Hash  (cost=8395.92..8395.92 rows=32292 
width=4) (actual time=153.984..153.984 rows=272663 loops=1)
Buckets: 131072 (originally 32768)  
Batches: 4 (originally 1)  Memory Usage: 3425kB
->  Seq Scan on 
transactions_transaction u0  (cost=0.00..8395.92 rows=32292 width=4) (actual 
time=0.003..71.988 rows=272663 loops=1)
Planning time: 0.410 ms
Execution time: 470237.400 ms

And this is the EXPLAIN ANALYZE output in 9.4.5:

Insert on tmp_joined_transactions_9gkgptn5xcp9  (cost=88544.31..88576.62 
rows=2154 width=4) (actual time=1356.065..1356.065 rows=0 loops=1)
  ->  Unique  (cost=88544.31..88555.08 rows=2154 width=4) (actual 
time=1347.480..1350.548 rows=4715 loops=1)
->  Sort  (cost=88544.31..88549.70 rows=2154 width=4) (actual 
time=1347.477..1348.432 rows=4715 loops=1)
  Sort Key: transactions_transaction.id
  Sort Method: quicksort  Memory: 414kB
  ->  Hash Join  (cost=72700.01..88425.06 rows=2154 width=4) 
(actual time=1107.077..1345.650 rows=4715 loops=1)
Hash Cond: (transactions_transaction.id = 
tmp_joined_transactions_75chlsokrsev.transaction_id)
->  Seq Scan on transactions_transaction  
(cost=72565.61..87199.11 rows=122287 width=4) (actual time=1104.855..1269.783 
rows=251736 loops=1)
  Filter: ((NOT (hashed SubPlan 1)) AND (date_created 
>= '2010-01-01'::date) AND (date_created <= '2015-12-31'::date) AND