On Mon, Oct 21, 2013 at 1:36 PM, Robert Haas <robertmh...@gmail.com> wrote:
> On Mon, Oct 21, 2013 at 4:01 PM, Tom Lane <t...@sss.pgh.pa.us> wrote: > > Gavin Flower <gavinflo...@archidevsys.co.nz> writes: > >>>> If we're going to extend pg_stat_statements, even more than min and > max > >>>> I'd like to see the standard deviation in execution time. > > > >> How about the 'median', often a lot more useful than the 'arithmetic > >> mean' (which most people call the 'average'). > > > > AFAIK, median is impossible to calculate cheaply (in particular, with > > a fixed amount of workspace). So this apparently innocent request > > is actually moving the goalposts a long way, because the space per > > query table entry is a big concern for pg_stat_statements. > > Yeah, and I worry about min and max not being very usable - once they > get pushed out to extreme values, there's nothing to drag them back > toward normality except resetting the stats, and that's not something > we want to encourage people to do frequently. Of course, averages over > very long sampling intervals may not be too useful anyway, dunno. > I think the pg_stat_statements_reset() should be done every time you make a change which you think (or hope) will push the system into a new regime, which goes for either min/max or for average/stdev. A histogram would be cool, but it doesn't seem very practical to implement. If I really needed that I'd probably set log_min_duration_statement = 0 and mine the log files. But that means I'd have to wait to accumulate enough logs once I made that change, then remember to turn it off. What I'd like most in pg_stat_statements now is the ability to distinguish which queries have a user grinding their teeth, versus which ones have a cron job patiently doing a wait4. I don't know the best way to figure that out, other than stratify on application_name. Or maybe a way to selectively undo the query text normalization, so I could see which parameters were causing the problem. Cheers, Jeff