Re: [GENERAL] Regression in query optimizer when upgrading from 9.4.5 to 9.5.2?

2016-06-22 Thread Erdmann, Markus @ Bellevue

On Jun 20, 2016, at 1:36 PM, David G. Johnston 
mailto:david.g.johns...@gmail.com>> wrote:

​Please don't top-post.

Yes, using "[VACUUM] ANALYZE" is what was meant (the VACUUM doesn't hurt - and 
possibly helps - but wasn't required).  ANALYZE recomputes the statistics for 
your database.  The apparent problem was that those statistics were wrong which 
causes the planner to choose the wrong plan.

EXPLAIN ANALYZE doesn't help in that scenario - the explain and planner both 
end up using the bad statistics.  The addition of ANALYZE to EXPLAIN simply 
tells the system to not only explain your query but to execute it as well (but 
discard the results).  Aside from sharing the same 7 characters the two words 
have nothing in common.

I'm not sure how one would "...includ[e] ANALYZE in EXPLAIN ANALYZE" - you only 
get to explain one statement at a time.

David J.


Thank you, David. My confusion originated from a lack of familiarity with the 
ANALYZE command. Your responses and Tom’s response have been very enlightening.

The head-scratcher for us is that our statistics became so out of date even 
though we have the autovacuum daemon enabled in RDS, and according to the docs 
that does run ANALYZE periodically. Now we know (thanks to your help) to check 
for this issue immediately when the planner is showing a large disparity 
between the estimated and actual cost.

Markus E.


Re: [GENERAL] Regression in query optimizer when upgrading from 9.4.5 to 9.5.2?

2016-06-20 Thread David G. Johnston
On Fri, Jun 17, 2016 at 5:05 PM, Erdmann, Markus @ Bellevue <
markus.erdm...@cbre.com> wrote:

> Thank you Tom and David for your very helpful replies. We dumped and
> restored the RDS staging database on a local installation of pg and were
> not able to reproduce the issue in 9.5.2, which led us to try running a
> VACUUM ANALYZE and recreating indexes. After this we no longer saw a
> discrepancy between the query plan in the separate environments. Is this
> what you meant, Tom, by making sure to ANALYZE? Or did you mean including
> ANALYZE in EXPLAIN ANALYZE?
>
>
​Please don't top-post.

Yes, using "[VACUUM] ANALYZE" is what was meant (the VACUUM doesn't hurt -
and possibly helps - but wasn't required).  ANALYZE recomputes the
statistics for your database.  The apparent problem was that those
statistics were wrong which causes the planner to choose the wrong plan.

EXPLAIN ANALYZE doesn't help in that scenario - the explain and planner
both end up using the bad statistics.  The addition of ANALYZE to EXPLAIN
simply tells the system to not only explain your query but to execute it as
well (but discard the results).  Aside from sharing the same 7 characters
the two words have nothing in common.

I'm not sure how one would "...includ[e] ANALYZE in EXPLAIN ANALYZE" - you
only get to explain one statement at a time.

David J.


Re: [GENERAL] Regression in query optimizer when upgrading from 9.4.5 to 9.5.2?

2016-06-20 Thread Erdmann, Markus @ Bellevue
Thank you Tom and David for your very helpful replies. We dumped and
restored the RDS staging database on a local installation of pg and were
not able to reproduce the issue in 9.5.2, which led us to try running a
VACUUM ANALYZE and recreating indexes. After this we no longer saw a
discrepancy between the query plan in the separate environments. Is this
what you meant, Tom, by making sure to ANALYZE? Or did you mean including
ANALYZE in EXPLAIN ANALYZE?


On 6/17/16, 11:17 AM, "Tom Lane"  wrote:

>"Erdmann, Markus @ Bellevue"  writes:
>> We¹re trying to debug a performance issue affecting our staging
>>database, and we¹ve narrowed it down to a difference in the query
>>optimizer in 9.5.2. Upgrading to 9.5 is important for us because we need
>>the ability to import foreign schemas.
>
>I think the core of the problem is the large rowcount estimation error
>here:
>
>>   ->  Bitmap Index Scan on
>>transactions_transaction_c69e55a4  (cost=0.00..18.02 rows=161 width=0)
>>(actual time=20.153..20.153 rows=269021 loops=1)
>> Index Cond: ((date_created >=
>>'2010-01-01'::date) AND (date_created <= '2015-12-31'::date))
>
>That's a pretty simple condition and it's hard to believe that 9.5 does it
>much differently than 9.4 did.  Perhaps you forgot to ANALYZE, or were
>using a larger statistics target in the 9.4 installation?
>
>   regards, tom lane



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Regression in query optimizer when upgrading from 9.4.5 to 9.5.2?

2016-06-17 Thread Tom Lane
"Erdmann, Markus @ Bellevue"  writes:
> We’re trying to debug a performance issue affecting our staging database, and 
> we’ve narrowed it down to a difference in the query optimizer in 9.5.2. 
> Upgrading to 9.5 is important for us because we need the ability to import 
> foreign schemas.

I think the core of the problem is the large rowcount estimation error
here:

>   ->  Bitmap Index Scan on 
> transactions_transaction_c69e55a4  (cost=0.00..18.02 rows=161 width=0) 
> (actual time=20.153..20.153 rows=269021 loops=1)
> Index Cond: ((date_created >= 
> '2010-01-01'::date) AND (date_created <= '2015-12-31'::date))

That's a pretty simple condition and it's hard to believe that 9.5 does it
much differently than 9.4 did.  Perhaps you forgot to ANALYZE, or were
using a larger statistics target in the 9.4 installation?

regards, tom lane


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Regression in query optimizer when upgrading from 9.4.5 to 9.5.2?

2016-06-17 Thread David G. Johnston
On Thu, Jun 16, 2016 at 8:40 PM, Erdmann, Markus @ Bellevue <
markus.erdm...@cbre.com> wrote:

> Hello,
>
> We’re trying to debug a performance issue affecting our staging database,
> and we’ve narrowed it down to a difference in the query optimizer in 9.5.2.
> Upgrading to 9.5 is important for us because we need the ability to import
> foreign schemas.
>
> This is the query we’re running:
>
> CREATE TEMP TABLE tmp_joined_transactions_9gkgptn5xcp9 ( transaction_id
> integer PRIMARY KEY );
>
> INSERT INTO tmp_joined_transactions_9gkgptn5xcp9 (transaction_id)
>

​This...
​

> SELECT DISTINCT ON ("transactions_transaction"."id")
> "transactions_transaction"."id"
>

​​DISTINCT is a code smell.  DISTINCT ON less so - it helps to avoid
self-joins - but your inclusion of ON here is pointless since the only
output column is "id".

​As written there should be no way to get duplicate "id"s into the output
result.  Or, if the tmp_joined_transactions relationship is 1-to-many you
should instead use a semi-join instead of an inner join.

FROM "transactions_transaction" , "tmp_joined_transactions_75chlsokrsev"
> WHERE
>

​Here...​

(NOT ("transactions_transaction"."id"
> IN (SELECT U0."id" AS Col1
> FROM "transactions_transaction" U0
> LEFT OUTER JOIN "transactions_commission" U1
> ON ( U0."id" = U1."transaction_id" )
> WHERE U1."id" IS NULL))
>

​Making this an anti-join (NOT EXISTS instead of NOT IN) stands out as an
easy improvement to try:

​It also makes the logic clearer since you seem to have a double-negative
here which means you really want a semi-join (which I wrote below)

WHERE EXISTS (SELECT 1 FROM transactions_commission WHERE
transactions_transaction.id = transactions_commission.transaction_id)

I won't promise this gives the same answer...I don't have enough spare
brain power or the ability to test it...but its seems correct.

AND "transactions_transaction"."date_created" >= '2010-01-01'::date
> AND "transactions_transaction"."date_created" <= '2015-12-31'::date
> AND "transactions_transaction"."deal_status" IN (1)
>

​Also...
​


> AND (transactions_transaction.id =
> tmp_joined_transactions_75chlsokrsev.transaction_id))
>

​This is style but I'm really a fan of using ANSI JOIN syntax...turning the
above into a 

ORDER BY "transactions_transaction"."id" ASC;
>
>
​The regression itself someone else would need to comment on.

David J.