On 24/10/13 12:14, Jeff Janes wrote:
On Wed, Oct 23, 2013 at 4:00 PM, Gavin Flower
<gavinflo...@archidevsys.co.nz <mailto:gavinflo...@archidevsys.co.nz>>
wrote:
On 24/10/13 11:26, Peter Geoghegan wrote:
On Wed, Oct 23, 2013 at 2:57 PM, Gavin Flower
<gavinflo...@archidevsys.co.nz
<mailto:gavinflo...@archidevsys.co.nz>> wrote:
Looks definitely bimodal in the log version, very clear!
Yes, I feel that having a 32 log binary binned histogram
(as Alvaro Herrera
suggested) would be very useful.
I'm having a hard time imagining how you'd actually implement
that.
For example, this:
https://wiki.postgresql.org/wiki/Aggregate_Histogram
requires that a "limit" be specified ahead of time. Is there a
principled way to increase or decrease this kind of limit over
time,
and have the new buckets contents "spill into each other"?
To smplify things, I'm using 5 buckets, but 32 would be better.
Assume first bucket width is 1ms.
bucket range
0 x =< 1ms
1 1ms < x =< 2ms
2 2ms < x =< 4ms
3 4ms < x =< 8ms
5 8ms < x
The last bucket would be limited to 8ms < x <= 16 ms. If you find
something > 16ms, then you have to rescale *before* you increment any
of the buckets. Once you do, there is now room to hold it.
bucket range
0 x =< 2ms (sum of previous bucket 0 and previous
bucket 1)
1 2ms < x =< 4ms
2 4ms < x =< 8ms
3 8ms < x =< 16ms
4 16ms < x =< 32ms (starts empty)
Cheers,
Jeff
It is very important that the last bucket be unbounded, or you may lose
potentially important data. Especially if one asumes that all durations
will fit into the first n - 1 buckets, in which case being alerted to
the asumption being siginificantly wrong is crucial!
The logic to check on the values for the last bucket is trivial, so
there is no need to have an upper limit for it.
Cheers,
Gavin