>> Maybe what should be done about this is to have separate sizes for the
>> MCV list and the histogram, where the MCV list is automatically sized
>> during ANALYZE.
It's been suggested multiple times that we should base our sample size on a
percentage of the table, or at least offer that as an option. I've pointed
out (with math; Simon wrote a prototype) that doing block-based sampling
instead of random-row sampling would let us collect, say, 2% of a very large
table without more I/O than we're doing now: random-row sampling pays roughly
one block read per sampled row on a big table, while block sampling keeps
every row on each block it reads. (Toy sketch below, after my sig.)

Nathan Boley has also shown that we could get tremendously better estimates
without additional sampling if our statistics collector recognized common
patterns such as normal, linear, and geometric distributions. Right now our
whole stats system assumes a completely random distribution. (Second sketch
below.)

So I think we could easily be quite a bit smarter than just increasing the
size of the MCV list. Although that might be a nice start.

--
Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com
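P.S. To put rough numbers on the block-sampling claim, here's a toy Python
sketch. Every figure in it is an illustrative assumption (an 8 KB block
size, ~100 tuples per block, and a 30,000-row sample, which is what ANALYZE
takes at default_statistics_target = 100), not planner code:

    # Back-of-the-envelope: random-row vs. block-based sampling.
    # All numbers are assumptions for illustration only.
    import random

    TABLE_BLOCKS = 1_000_000      # ~8 GB table at 8 KB per block
    ROWS_PER_BLOCK = 100          # assumed average tuples per block
    TOTAL_ROWS = TABLE_BLOCKS * ROWS_PER_BLOCK

    # Random-row sampling: on a big table nearly every sampled row
    # lands on a distinct block, so ~30,000 rows costs ~30,000 reads.
    sample_rows = 30_000
    blocks_read = len({random.randrange(TABLE_BLOCKS)
                       for _ in range(sample_rows)})

    # Block sampling: read fewer blocks, but keep every row on them.
    # 20,000 blocks * 100 rows = 2,000,000 rows, i.e. 2% of the
    # table for *less* I/O than the row sample above.
    sample_blocks = 20_000
    rows_collected = sample_blocks * ROWS_PER_BLOCK

    print(f"row sampling:   ~{blocks_read:,} blocks for {sample_rows:,} rows")
    print(f"block sampling: {sample_blocks:,} blocks for "
          f"{rows_collected:,} rows "
          f"({rows_collected / TOTAL_ROWS:.0%} of the table)")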
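P.P.S. And a second toy sketch of what distribution recognition could buy
us. This is emphatically not Nathan's code, just the shape of the idea: if
the sample fits a normal distribution, two fitted parameters give a better
range-selectivity estimate than assuming nothing about the shape:

    # Toy distribution-aware selectivity estimate, illustration only.
    import random
    import statistics
    from math import erf, sqrt

    random.seed(1)
    population = [random.gauss(500, 50) for _ in range(1_000_000)]
    sample = random.sample(population, 1_000)

    # "Fit" a normal distribution to the sample: just mean and stddev.
    mu = statistics.mean(sample)
    sigma = statistics.stdev(sample)

    def normal_cdf(x):
        return 0.5 * (1.0 + erf((x - mu) / (sigma * sqrt(2.0))))

    # Selectivity of "col BETWEEN 450 AND 550", estimated from the two
    # fitted parameters vs. computed exactly over the whole table.
    est = normal_cdf(550) - normal_cdf(450)
    actual = sum(450 <= v <= 550 for v in population) / len(population)
    print(f"estimated: {est:.4f}   actual: {actual:.4f}")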