On 12/07/2014 03:54 AM, Tomas Vondra wrote:
The one interesting case is the 'step skew' with statistics_target=10,
i.e. estimates based on mere 3000 rows. In that case, the adaptive
estimator significantly overestimates:

     values   current    adaptive
     ------------------------------
     106           99         107
     106            8     6449190
     1006          38     6449190
     10006        327       42441

I don't know why I didn't get these errors in the previous runs, because
when I repeat the tests with the old patches I get similar results with
a 'good' result from time to time. Apparently I had a lucky day back
then :-/

I've been messing with the code for a few hours, and I haven't found any
significant error in the implementation, so it seems that the estimator
does not perform terribly well for very small samples (in this case it's
3000 rows out of 10.000.000 (i.e. ~0.03%).

The paper [1] gives an equation for an upper bound of the error of this GEE estimator. How do the above numbers compare with that bound?

[1] http://ftp.cse.buffalo.edu/users/azhang/disc/disc01/cd1/out/papers/pods/towardsestimatimosur.pdf

- Heikki



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to