On 10/21/2013 07:29 PM, Tom Lane wrote:
> Andrew Dunstan <and...@dunslane.net> writes:
>> This is why I suggested the standard deviation, and why I find it would
>> be more useful than just min and max. A couple of outliers will set the
>> min and max to possibly extreme values but hardly perturb the standard
>> deviation over a large number of observations.
>
> Hm.  It's been a long time since college statistics, but doesn't the
> entire concept of standard deviation depend on the assumption that the
> underlying distribution is more-or-less normal (Gaussian)?  Is there a
> good reason to suppose that query runtime is Gaussian?  (I'd bet not;
> in particular, multimodal behavior seems very likely due to things like
> plan changes.)  If not, how much does that affect the usefulness of
> a standard-deviation calculation?


IANA statistician, but the article at <https://en.wikipedia.org/wiki/Standard_deviation> appears to have a diagram with one sample that's multi-modal.
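
A minimal sketch of the two points under discussion, in Python, using made-up numbers rather than anything measured: the runtimes below are a hypothetical bimodal sample (a "fast plan" near 10 ms and a "slow plan" near 50 ms), and the two outliers are likewise invented. The standard deviation is defined for any sample with finite variance, Gaussian or not, and the sketch compares how far min/max and the standard deviation move when a couple of outliers are added.

```python
import statistics

# Hypothetical query runtimes in milliseconds (an assumption for illustration):
# two modes, as might result from a plan change.
runtimes = [10.0] * 5000 + [50.0] * 5000

# Two invented outliers: one very fast run and one very slow run.
outliers = [0.1, 500.0]


def summarize(samples):
    """Return min, max, mean and population standard deviation."""
    return {
        "min": min(samples),
        "max": max(samples),
        "mean": statistics.fmean(samples),
        "stddev": statistics.pstdev(samples),
    }


base = summarize(runtimes)
with_outliers = summarize(runtimes + outliers)

for label, stats in (("base", base), ("with outliers", with_outliers)):
    print(
        f"{label:>14}: min={stats['min']:.1f} max={stats['max']:.1f} "
        f"mean={stats['mean']:.2f} stddev={stats['stddev']:.2f}"
    )
```

With these particular numbers the max jumps from 50 to 500 while the standard deviation moves by well under 1 ms. How small that shift is depends on the sample size and on how extreme the outliers are, so this shows the effect rather than guaranteeing it. What the standard deviation means for a non-Gaussian sample is a separate matter: Chebyshev's inequality still guarantees that at least 1 - 1/k^2 of the observations lie within k standard deviations of the mean, but the tighter rules of thumb for normal data do not carry over.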

cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
