I wrote:
> No, the thing that is bothering me is why it seems to be correct to
> apply a positive correction for ">=", a negative correction for "<",
> and no correction for "<=" or ">".  That seems weird and I can't
> construct a plausible explanation for it.

After further thought, I can put a little more clarity to this, but
it's still not really resolved.  It's easily shown by experiment that
the existing code correctly computes the probability that "x <= p"
where p is the given probe value.  It uses that value as-is for the <
and <= cases, and 1 minus that value for > and >=.  From this statement,
it's clear why the above is the right way to correct matters: "<" has
to subtract the fraction of rows equal to p, ">=" has to add it back,
and "<=" and ">" are already right as they stand.

What I find remarkable is that this is what the loop computes
regardless of which of the four operators is used to probe, and
regardless of whether the probe value p is exactly equal to some
histogram boundary value.  That doesn't seem intuitive at all --- when
p does match a histogram entry, you'd think it would matter which
operator you probe with.

(Pokes at it some more...)  Oh, interesting: it behaves that way except
when p is exactly the lowest histogram entry.  The code is probably
blowing off that edge case without enough thought.

                        regards, tom lane
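
P.S. For anyone who wants to see the shape of the computation without
reading the planner source, here is a toy model in Python.  It is only
a sketch under stated assumptions, not the existing code: the names
frac_le() and selectivity() are made up for illustration, the histogram
is assumed to be equal-depth, and values are assumed to be spread
linearly within each bin.  It shows a single "x <= p" estimate being
reused as-is for < and <=, and as 1 minus that value for > and >=.

```python
from bisect import bisect_right


def frac_le(bounds, p):
    """Estimate the fraction of rows with x <= p.

    bounds is a sorted list of equal-depth histogram boundaries, so each
    of the len(bounds) - 1 bins is assumed to hold the same share of rows.
    (Hypothetical helper, not a PostgreSQL function.)
    """
    nbins = len(bounds) - 1
    if p < bounds[0]:
        return 0.0
    if p >= bounds[-1]:
        return 1.0
    # Find the bin i with bounds[i] <= p < bounds[i + 1].
    i = bisect_right(bounds, p) - 1
    lo, hi = bounds[i], bounds[i + 1]
    # Assumption: linear interpolation inside the bin.
    within = (p - lo) / (hi - lo)
    return (i + within) / nbins


def selectivity(bounds, op, p):
    """Use the same "x <= p" estimate for all four operators."""
    le = frac_le(bounds, p)
    if op in ("<", "<="):
        return le
    if op in (">", ">="):
        return 1.0 - le
    raise ValueError("unsupported operator: " + op)


if __name__ == "__main__":
    bounds = [0, 10, 20, 30, 40]
    for op in ("<", "<=", ">", ">="):
        print(op, selectivity(bounds, op, 15))
```

Note that the toy model gives "<" and "<=" the same answer by
construction, which is exactly why a separate correction for the rows
equal to p is needed.  It makes no attempt to reproduce what the real
loop does when p is exactly the lowest histogram entry.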