On Tue, Jan 17, 2012 at 12:04:06PM +0400, Alexander Korotkov wrote:
> Thanks for your fixes to the patch. Them looks correct to me. I did some
> fixes in the patch. The proof of some concepts is still needed. I'm going
> to provide it in a few days.

Your further fixes look good. Could you also answer my question about the
header comment of mcelem_array_contained_selec()?

/*
 * Estimate selectivity of "column <@ const" based on most common element
 * statistics. Independent element occurrence would imply a particular
 * distribution of distinct element counts among matching rows. Real data
 * usually falsifies that assumption. For example, in a set of 1-element
 * integer arrays having elements in the range [0;10], element occurrences are
 * not independent. If they were, a sufficiently-large set would include all
 * distinct element counts 0 through 11. We correct for this using the
 * histogram of distinct element counts.
 *
 * In the "column @> const" and "column && const" cases, we usually have
 * "const" with low summary frequency of elements (otherwise we have
 * selectivity close to 0 or 1 correspondingly). That's why the effect of
 * dependence related to distinct element counts distribution is negligible
 * there. In the "column <@ const" case, summary frequency of elements is
 * high (otherwise we have selectivity close to 0). That's why we should do
 * correction due to array distinct element counts distribution.
 */

By "summary frequency of elements", do you mean literally
P_0 + P_1 ... + P_N? If so, I can follow the above argument for
"column && const" and "column <@ const", but not for "column @> const".
For "column @> const", selectivity cannot exceed the smallest frequency
among const elements. A number of high-frequency elements will drive up
the sum of the frequencies without changing the true selectivity much at
all.

Thanks,
nm
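
P.S. To make the "column @> const" concern concrete, here is a minimal
Python sketch. It is not code from the patch: the helper name
contains_bounds() and the element frequencies are made up for illustration,
and it assumes "summary frequency" means the plain sum of the per-element
frequencies. The product is only what independence would predict, not what
the patch computes.

```python
from math import prod


def contains_bounds(freqs):
    """For "column @> const", return three numbers derived from the
    per-element frequencies of the elements of "const":
      - the plain sum of the frequencies (assumed "summary frequency"),
      - the smallest frequency, an upper bound on the true selectivity,
      - the product, i.e. the estimate under independent occurrence.
    """
    summary = sum(freqs)
    upper_bound = min(freqs)
    independent = prod(freqs)
    return summary, upper_bound, independent


# Hypothetical frequencies: one rare element, then the same rare element
# accompanied by three very common ones.
rare_only = [0.01]
rare_plus_common = [0.01, 0.9, 0.9, 0.9]

for freqs in (rare_only, rare_plus_common):
    summary, upper_bound, independent = contains_bounds(freqs)
    print(f"sum={summary:.2f}  min={upper_bound:.2f}  product={independent:.4f}")
```

Adding the three common elements raises the sum from 0.01 to 2.71, yet the
upper bound on selectivity stays at 0.01, so a large sum by itself says
little about how selective "column @> const" is.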