All,

Bringing this down to Earth:  yes, it would be useful to have min and
max for pg_stat_statement, and even if we add more stats to
pg_stat_statement, it would be useful to have those two.  So can we
approve this patch on that basis?

For my part, I generally use the 9-part percentiles for query analysis
(0,5,10,25,50,75,90,95,100).  However, that's fairly expensive to
calculate, and would require a histogram or other approaches mentioned
earlier.

On 10/22/2013 11:16 AM, Jeff Janes wrote:
> It is easy to misinterpret the standard deviation if the distribution is
> not gaussian, but that is also true of the average.  The standard deviation
> (or the variance) is commonly used with non-gaussian distributions, either
> because it is the most efficient estimator for those particular
> distributions, or just because it is so commonly available.

On the other hand, it's still true that a high STDDEV indicates a high
variance in the response times of a particular query, whereas a low one
indicates that most are close to the average.  While precision math
might not work if we don't have the correct distribution, for gross DBA
checks it's still useful.  That is, I can answer the question in many
cases of: "Does this query have a high average because of outliers, or
because it's consisently slow?" by looking at the STDDEV.

And FWIW, for sites where we monitor pg_stat_statements, we reset daily
or weekly.  Otherwise, the stats have no meaning.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to