Re: [HACKERS] [PERFORM] Bad n_distinct estimation; hacks suggested?

Markus Schaber Tue, 03 May 2005 09:41:50 -0700

Hi, Josh,

Josh Berkus wrote:


> Yes, actually.   We need 3 different estimation methods:
> 1 for tables where we can sample a large % of pages (say, >= 0.1)
> 1 for tables where we sample a small % of pages but are "easily estimated"
> 1 for tables which are not easily estimated but we can't afford to sample a 
> large % of pages.
> 
> If we're doing sampling-based estimation, I really don't want people to lose 
> sight of the fact that page-based random sampling is much less expensive than 
> row-based random sampling.   We should really be focusing on methods which 
> are page-based.

Would it make sense to have a sample method that scans indices? I think
that, at least for tree based indices (btree, gist), rather good
estimates could be derived.
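
A minimal sketch of why a sorted index could give good estimates, under loud assumptions: the leaf level of a btree is modelled here as a sorted Python list cut into fixed-size pages, and `make_leaf_pages` and `estimate_n_distinct` are hypothetical names, not PostgreSQL functions. Because equal keys are adjacent in sorted order, each distinct value starts exactly one "run", so counting run starts on a random sample of leaf pages and scaling by the sampling fraction gives an estimate of the distinct count.

```python
import random


def make_leaf_pages(values, page_size):
    """Model a btree leaf level: sorted keys split into fixed-size pages."""
    keys = sorted(values)
    return [keys[i:i + page_size] for i in range(0, len(keys), page_size)]


def estimate_n_distinct(pages, sample_pages, rng):
    """Estimate the number of distinct keys from a sample of leaf pages.

    A "run start" is an entry whose key differs from the entry just
    before it in index order. The total number of run starts over all
    pages equals the number of distinct keys, so the sampled count is
    scaled up by len(pages) / sample_pages.
    """
    chosen = rng.sample(range(len(pages)), sample_pages)
    run_starts = 0
    for p in chosen:
        # Assumption: the last key of the preceding page is available
        # (a real index would have to look at the neighbouring page).
        prev = pages[p - 1][-1] if p > 0 else None
        for i, key in enumerate(pages[p]):
            is_first_entry = (p == 0 and i == 0)
            if is_first_entry or key != prev:
                run_starts += 1
            prev = key
    return run_starts * len(pages) / sample_pages


if __name__ == "__main__":
    column = [i % 37 for i in range(1000)]   # made-up data, 37 distinct values
    leaf_pages = make_leaf_pages(column, page_size=10)
    print(estimate_n_distinct(leaf_pages, 20, random.Random(0)))
```

With every page sampled, the estimate equals the exact distinct count; with fewer pages it is only an estimate, and its variance depends on how the runs are spread across pages.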

And the presence of a unique index should give a 100% distinct-values
estimate without any scan at all.
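
The unique-index shortcut could look roughly like the sketch below. Everything in it is an assumption for illustration: `estimate_distinct` is a hypothetical helper, the row and NULL counts are taken as already known, and the shortcut is only valid for a unique index on that single column (a multi-column unique index says nothing about each column on its own). NULLs are subtracted because a unique index can still admit several NULL entries.

```python
def estimate_distinct(row_count, null_count, has_unique_index, sample_estimator):
    """Return an estimated number of distinct non-NULL values.

    has_unique_index: True only if a unique index covers exactly this column.
    sample_estimator: zero-argument callable used as the fallback
                      (for example, a sampling-based estimate).
    """
    if has_unique_index:
        # Every non-NULL value is distinct, so no scan is needed.
        return row_count - null_count
    # Otherwise fall back to whatever sampling method is in use.
    return sample_estimator()


if __name__ == "__main__":
    # Made-up numbers: 1000 rows, 50 NULLs, unique index present.
    print(estimate_distinct(1000, 50, True, lambda: 123.0))
```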

Markus


