> > I've been reading something about implementation of histograms, and,
> > AFAIK, in practice histograms is just a cool name for no more than:
> > 1. top ten with frequency for each
> > 2. the same for top ten worse
> > 3. average for the rest

Consider that we only need that info for the choice of index. If an
average value is too frequent for the index to be efficient, you can
safely drop the index, since it would be useless.

Thus it seems to me that keeping stats on the most infrequent values
(point 2) is useless. I would also expect these to be the most volatile
values, so the stats would only be accurate for a short period of time.

I think what we need is as follows:
1. our current histograms
2. a list of exceptions for exceptional values that are very frequent

Exceptional values are those that would skew the distribution too much.

Very infrequent values should not be used as min|max values of histogram
buckets, but that is imho all that needs to be done for infrequent values.

Andreas
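
To make the proposal above concrete, here is a minimal sketch of the two-part statistics (a histogram plus a list of very frequent "exceptions"). It does not reflect any existing implementation. All names (build_stats, eq_selectivity, index_looks_useful) and all parameters (num_buckets, skew_factor, max_selectivity) are hypothetical, chosen only for illustration. It also does not attempt to keep very infrequent values out of the bucket boundaries.

```python
from collections import Counter


def build_stats(values, num_buckets=4, skew_factor=3.0):
    """Split a column sample into frequent 'exceptions' plus a histogram.

    Assumption: a value counts as exceptional when it occurs more than
    skew_factor times as often as the average distinct value.
    """
    counts = Counter(values)
    total = len(values)
    avg_count = total / len(counts)

    # Part 2 of the proposal: very frequent values with their frequency.
    exceptions = {
        v: c / total for v, c in counts.items() if c > skew_factor * avg_count
    }

    # Part 1: equi-depth bucket boundaries over the remaining rows only.
    rest = sorted(v for v in values if v not in exceptions)
    bounds = []
    if rest:
        step = len(rest) / num_buckets
        bounds = [rest[min(int(i * step), len(rest) - 1)]
                  for i in range(num_buckets)]
        bounds.append(rest[-1])

    # Average frequency of a non-exceptional value ("average for the rest").
    rest_distinct = len(counts) - len(exceptions)
    avg_rest_freq = (len(rest) / total) / rest_distinct if rest_distinct else 0.0

    return {"exceptions": exceptions, "bounds": bounds,
            "avg_rest_freq": avg_rest_freq}


def eq_selectivity(stats, value):
    """Estimated fraction of rows matching 'column = value'."""
    return stats["exceptions"].get(value, stats["avg_rest_freq"])


def index_looks_useful(stats, value, max_selectivity=0.05):
    """Hypothetical cutoff: skip the index when too many rows would match."""
    return eq_selectivity(stats, value) <= max_selectivity


# One value dominates the sample; the other 30 values occur once each.
sample = [1] * 70 + list(range(2, 32))
stats = build_stats(sample)
print(stats["exceptions"])
print(stats["bounds"])
print(index_looks_useful(stats, 1), index_looks_useful(stats, 5))
```

In this sketch the dominant value ends up in the exception list and is kept out of the histogram, so it no longer distorts the bucket boundaries. The rare values need no per-value stats of their own, only the shared average.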