On Mon, Oct 1, 2012 at 3:22 AM, Jeff Davis <pg...@j-davis.com> wrote:
> On Tue, 2012-09-04 at 17:27 +0400, Alexander Korotkov wrote: > > Addon patch is attached. Actually, I don't get your intention of > > introducing STATISTIC_KIND_RANGE_EMPTY_FRAC stakind. Did you plan to > > leave it as empty frac in distinct stakind or replace this stakind > > with STATISTIC_KIND_LENGTH_HISTOGRAM? In the attached > > patch STATISTIC_KIND_RANGE_EMPTY_FRAC is replaced > > with STATISTIC_KIND_LENGTH_HISTOGRAM. > > Review comments: > > 1. In compute_range_stats, you need to guard against the case where > there is no subdiff function. Perhaps default to 1.0 or something? > Let it be 1.0 without subdiff function. However, there is not so much use of this method of estimation without subdiff. But, probably it's better than nothing. 2. I think it would be helpful to add comments regarding what happens > when lengths are identical, right now it's a little confusing. For > instance, the comment: "Generate a length histogram slot entry if there > are at least two length values" doesn't seem right, because the > condition still matches even if there is only one distinct value. > I've rephrased comment. Not it's implicitly says that collected values are not necessary distinct. > 3. It looks like get_distance also needs to guard against a missing > subdiff. > Same to compute_range_stats. Let default value be 1.0. > 4. There are 3 binary search functions, which seems a little excessive: > * rbound_bsearch: greatest i such that hist[i] < v; or -1 > * rbound_bsearch_equal: greatest i such that: > hist[i] <= v and (i=0 or hist[i-1] != hist[i]); or -1 > * length_hist_bsearch: least i such that hist[i] >= v; > or length of hist > (let me know if I misunderstood the definitions) > At a minimum, we need more consistent and informative names. Also, the > definition of rbound_bsearch_equal is a little confusing because it's > looking for the highest index among distinct values, but the lowest > index among identical values. Do you see a way to refactor these to be a > little easier to understand? > Actually, goal of rbound_bsearch_equal is to find histogram bin to start interpolation from. I've renamed it to rbound_bsearch_bin and added corresponding comment. ------ With best regards, Alexander Korotkov.
range_stat-0.8.patch.gz
Description: GNU Zip compressed data
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers