At Tue, 08 Jan 2019 16:26:38 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI
<[email protected]> wrote in
<[email protected]>
> Hello.
>
> At Fri, 21 Dec 2018 11:50:28 -0500, Tom Lane <[email protected]> wrote in
> <[email protected]>
> > seem that that's just moving the problem around, but I think it
> > might be possible to show that such a value couldn't be computed
> > by scalarltsel given a histogram with no more than 10000 members.
> > (I haven't tried to actually prove that, but it seems intuitive
> > that the set of possible results would be quantized with no more
> > than about 5 digits precision.)
I don't think we need a perfect proof of that. The fact that
exactly 1/3 is quite natural and common, while 1/3 + ε is not,
should be enough.
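A minimal sketch of why that seems plausible, under a deliberately
simplified model: it assumes a selectivity is a plain fraction i/n with
n no larger than 10000, which is not the actual arithmetic done by
scalarltsel. The function name count_fraction_matches is made up for
this illustration.

```c
#include <assert.h>

/*
 * Count the pairs (i, n) with 1 <= n <= max_n and 0 <= i <= n whose
 * quotient i / n, computed in double precision, compares equal to target.
 * This is a simplified stand-in for "values a histogram could produce",
 * not the real scalarltsel computation.
 */
int
count_fraction_matches(double target, int max_n)
{
    int matches = 0;

    assert(max_n > 0);

    for (int n = 1; n <= max_n; n++)
    {
        for (int i = 0; i <= n; i++)
        {
            /* Assign to a double so the comparison uses double precision. */
            double q = (double) i / (double) n;

            if (q == target)
                matches++;
        }
    }

    return matches;
}
```

Under that simplification, count_fraction_matches(0.3333333333333333, 10000)
finds one hit for every n divisible by 3, while
count_fraction_matches(0.33333333333333342, 10000) should find none,
because any such fraction other than exactly 1/3 differs from 1/3 by at
least 1/30000.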
> FWIW, I got the following result on my environment. It seems
> different enough if this holds on all supported platforms, though
> there still is a case where the result of a sequence of
> arithmetics makes false match.
In theory, the simple selectivity of a relation cannot match the
value with the epsilon added. (Of course, this is on *my* environment.)
(0.333..)
binary format: 3f d5 55 55 55 55 55 55
x = 0.333333333333333315
231 matches, 79 no_matches
(0.3{13}42..)
binary format: 3f d5 55 55 55 55 55 f1
x = 0.333333333333341975
0 matches, 310 no_matches
(0.3{15}42..)
binary format: 3f d5 55 55 55 55 55 57
x = 0.333333333333333426
0 matches, 310 no_matches
It seems that 0.3{13}42 should correctly be 0.3{15}42, which differs
from 1/3 by just two units in the last place. I believe C is well
standardized on the translation of the decimal literal. The other
DEFAULT_*_SELs are not compared in this way.
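As a rough way to check the "two LSBs" figure, the sketch below compares
the raw bit patterns of two doubles. It assumes IEEE 754 binary64 doubles
and two positive finite inputs. The helper names double_bits and
ulp_distance are made up for this illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Return the raw bit pattern of a double (assumes a 64-bit double). */
uint64_t
double_bits(double x)
{
    uint64_t bits;

    assert(sizeof(bits) == sizeof(x));
    memcpy(&bits, &x, sizeof(bits));
    return bits;
}

/*
 * Number of representable steps between two positive finite doubles.
 * For such values the bit patterns are ordered like the values themselves,
 * so the difference of the patterns is the distance in units in the last
 * place.
 */
uint64_t
ulp_distance(double a, double b)
{
    uint64_t ba = double_bits(a);
    uint64_t bb = double_bits(b);

    return (ba > bb) ? ba - bb : bb - ba;
}
```

With the bit patterns shown in the output above (...55 55 versus ...55 57),
ulp_distance(0.3333333333333333, 0.33333333333333342) should come out as 2
on a platform where the compiler rounds the literals to the nearest double.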
The attached small patch fixes the case mentioned in this thread,
but I'm not sure where to put a test. A static assertion is not
usable. An assertion somewhere, perhaps in clauselist_selectivity,
seems somewhat overdone. I can't find a suitable place in the
regression tests.
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
#include <stdio.h>
#include <math.h>

/* Print the bytes of x, then count how often d / (3d) style ratios equal it. */
static void test(double x)
{
    double d = 1.0;
    double d0 = 0;
    unsigned char *c_x = (unsigned char *) &x;
    int nmatches = 0;
    int nnomatches = 0;
    int i;

    /* Dump the bytes most significant first (assumes little-endian). */
    fprintf(stderr, "binary format: ");
    for (i = 7 ; i >= 0 ; i--)
        fprintf(stderr, "%s%02x", i < 7 ? " " : "", c_x[i]);
    fprintf(stderr, "\n");
    fprintf(stderr, "x = %20.18f\n", x);

    /* d grows by powers of 10 until it stops changing (reaches infinity). */
    while (d != d0)
    {
        double z = floor(d * 3);
        double z1 = z + 1.0;
        double y = d / z;
        double y1 = d / z1;

        /* Count a match if d divided by either neighbor of d * 3 equals x */
        if (y == x || y1 == x)
            nmatches++;
        else
            nnomatches++;

        d0 = d;
        d = d * 10;
    }

    fprintf(stderr, " %d matches, %d no_matches\n", nmatches, nnomatches);
}

int main(void)
{
    test(0.3333333333333333);
    test(0.333333333333342);
    test(0.33333333333333342);

    return 0;
}
diff --git a/src/backend/optimizer/path/clausesel.c b/src/backend/optimizer/path/clausesel.c
index 3739b9817a..cdeaac22c8 100644
--- a/src/backend/optimizer/path/clausesel.c
+++ b/src/backend/optimizer/path/clausesel.c
@@ -109,7 +109,7 @@ clauselist_selectivity(PlannerInfo *root,
ListCell *l;
int listidx;
- /*
+ /*
* If there's exactly one clause, just go directly to
* clause_selectivity(). None of what we might do below is relevant.
*/
diff --git a/src/include/utils/selfuncs.h b/src/include/utils/selfuncs.h
index 5cc4cf15e2..15a8d2402a 100644
--- a/src/include/utils/selfuncs.h
+++ b/src/include/utils/selfuncs.h
@@ -33,8 +33,12 @@
/* default selectivity estimate for equalities such as "A = b" */
#define DEFAULT_EQ_SEL 0.005
-/* default selectivity estimate for inequalities such as "A < b" */
-#define DEFAULT_INEQ_SEL 0.3333333333333333
+/*
+ * default selectivity estimate for inequalities such as "A < b"
+ * The last two digits prevent it from making a false match with 1/3 computed
+ * from histogram and/or MCV.
+ */
+#define DEFAULT_INEQ_SEL 0.33333333333333342
/* default selectivity estimate for range inequalities "A > b AND A < c" */
#define DEFAULT_RANGE_INEQ_SEL 0.005