RE: [EXTERNAL] Re: performance expectations for table(s) with 2B recs

Godfrin, Philippe E Mon, 13 Dec 2021 08:35:06 -0800

>
>
>From: Peter J. Holzer hjp-pg...@hjp.at<mailto:hjp-pg...@hjp.at>
>Sent: Friday, December 10, 2021 3:43 PM
>To: 
>pgsql-general@lists.postgresql.org<mailto:pgsql-general@lists.postgresql.org>
>Subject: Re: [EXTERNAL] Re: performance expectations for table(s) with 2B recs
>
>On 2021-12-10 18:04:07 +0000, Godfrin, Philippe E wrote:
>> >But in my experience the biggest problem with large tables are unstable
>> >execution plans - for most of the parameters the optimizer will choose
>> >to use an index, but for some it will erroneously think that a full
>> >table scan is faster. That can lead to a situation where a query
>> >normally takes less than a second, but sometimes (seemingly at random)
>> >it takes several minutes
>[...]
>> For Peter I have a question. What exactly causes ‘unstable execution plans’ 
>> ??
>>
>> Besides not using bind variables, bad statistics, would you elaborate
>> in what would contribute to that instability?
>
>Not using bind variables and bad statistics are certainly big factors:
>
>On one hand not using bind variables gives a lot more information to the
>optimizer, so it can choose a better plan at run time. On the other hand
>that makes hard to predict what plan it will choose.
>
>Bad statistics come in many flavours: They might just be wrong, that's
>usually easy to fix. More problematic are statistics which just don't
>describe reality very well - they may not show a correlation, causing
>the optimizer to assume that two distributions are independent when they
>really aren't (since PostgreSQL 10 you can create statistics on multiple
>columns which helps in many but not all cases) or not show some other
>peculiarity of the data. Or they may be just so close to a flipping
>point that a small change causes the optimizer to choose a wildly
>different plan.
>
>Another source is dynamically generated SQL. Your application may just
>put together SQL from fragments or it might use something like
>SQLalchemy or an ORM. In any of these cases what looks like one query
>from a user's perspective may really be a whole family of related
>queries - and PostgreSQL will try to find the optimal plan for each of
>them. Which is generally a good thing, but it adds opportunities to mess
>up.
>
>hp
>
>--
>_ | Peter J. Holzer | Story must make more sense than reality.
>|_|_) | |
>| | | h...@hjp.at<mailto:h...@hjp.at> | -- Charles Stross, "Creative writing
>__/ | http://www.hjp.at/ | challenge!"


Good answer Peter, I agree wholeheartedly. I was curious if there was something 
specific to Postgresql .
phil

RE: [EXTERNAL] Re: performance expectations for table(s) with 2B recs

Reply via email to