On 12/07/2014 03:54 AM, Tomas Vondra wrote:
The one interesting case is the 'step skew' with statistics_target=10,
i.e. estimates based on a mere 3000 rows. In that case, the adaptive
estimator significantly overestimates:
 values   current   adaptive
------------------------------
    106        99        107
    106         8    6449190
   1006        38    6449190
  10006       327      42441
I don't know why I didn't get these errors in the previous runs, because
when I repeat the tests with the old patches I get similar results with
a 'good' result from time to time. Apparently I had a lucky day back
then :-/
I've been messing with the code for a few hours, and I haven't found any
significant error in the implementation, so it seems that the estimator
does not perform terribly well for very small samples (in this case it's
3000 rows out of 10.000.000, i.e. ~0.03%).
The paper [1] gives an equation for an upper bound of the error of this
GEE estimator. How do the above numbers compare with that bound?
[1]
http://ftp.cse.buffalo.edu/users/azhang/disc/disc01/cd1/out/papers/pods/towardsestimatimosur.pdf
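
For a rough comparison, something like the sketch below could be used. It
assumes the "values" column is the true number of distinct values, and it
uses the GEE form as I recall it from [1]: sqrt(n/r) * f1 plus the number of
values seen more than once in the sample, with a ratio-error guarantee on the
order of sqrt(n/r). The constant factor of the bound is not included, and the
function names are made up for illustration, they are not from the patch.

```python
import math
from collections import Counter


def gee_estimate(sample, total_rows):
    """GEE as recalled from [1]: sqrt(n/r) * f1 + sum of f_j for j >= 2,
    where f_j is the number of values seen exactly j times in the sample."""
    r = len(sample)
    counts = Counter(sample)
    f1 = sum(1 for c in counts.values() if c == 1)
    repeated = len(counts) - f1  # values seen at least twice
    return math.sqrt(total_rows / r) * f1 + repeated


def ratio_error(estimate, actual):
    """Ratio error: how many times too high or too low the estimate is."""
    return max(estimate / actual, actual / estimate)


n, r = 10_000_000, 3000       # table size and sample size from the test
scale = math.sqrt(n / r)      # order of magnitude of the bound, about 57.7

# (actual, current, adaptive) rows copied from the table above;
# treating the first column as the true ndistinct is an assumption.
rows = [
    (106, 99, 107),
    (106, 8, 6449190),
    (1006, 38, 6449190),
    (10006, 327, 42441),
]

for actual, current, adaptive in rows:
    print(
        f"actual={actual:>6}  "
        f"current err={ratio_error(current, actual):10.2f}  "
        f"adaptive err={ratio_error(adaptive, actual):10.2f}  "
        f"sqrt(n/r)={scale:.1f}"
    )
```

If the ratio error of an estimate is far above sqrt(n/r), that would point at
either a bug or at the estimator in the patch not being covered by that
bound.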
- Heikki
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)