On 12/10/2013 01:33 PM, Mark Kirkwood wrote:
> Yeah - and we seem to be back to Josh's point about needing 'some math'
> to cope with the rows within a block not being a purely random selection.

Well, sometimes they are effectively random. But sometimes they are not. The Chaudhuri et al. paper had a formula for estimating randomness based on the grouping of rows in each block, assuming that the sampled blocks were widely spaced (if they aren't, there's not much you can do). This is where you get up to needing a 5% sample; you need to take enough blocks that you're confident that the blocks you sampled are representative of the population.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
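
To illustrate why the grouping of rows within a block matters, here is a rough simulation sketch. It is not the formula from the paper and not how ANALYZE works; the table layout, block size, and all names (`block_sample_estimate`, `estimate_spread`) are made up for illustration. It reads whole blocks at a 5% block sampling rate and compares how much the estimate of a column's "fraction of 1s" wobbles when matching rows are physically clustered versus scattered at random.

```python
import random

# Toy parameters, assumed for illustration only.
ROWS_PER_BLOCK = 100
TOTAL_BLOCKS = 1000
SAMPLED_BLOCKS = 50  # 5% of the blocks


def block_sample_estimate(table, rng):
    """Estimate the fraction of 1s by reading every row of randomly chosen blocks."""
    chosen = rng.sample(range(TOTAL_BLOCKS), SAMPLED_BLOCKS)
    hits = 0
    for b in chosen:
        start = b * ROWS_PER_BLOCK
        hits += sum(table[start:start + ROWS_PER_BLOCK])
    return hits / (SAMPLED_BLOCKS * ROWS_PER_BLOCK)


def estimate_spread(table, trials=200, seed=1):
    """Standard deviation of the estimate over repeated block samples."""
    rng = random.Random(seed)
    estimates = [block_sample_estimate(table, rng) for _ in range(trials)]
    mean = sum(estimates) / trials
    return (sum((e - mean) ** 2 for e in estimates) / trials) ** 0.5


n_rows = ROWS_PER_BLOCK * TOTAL_BLOCKS
n_ones = n_rows // 10  # 10% of rows hold the value of interest

# Clustered layout: all matching rows sit in the first blocks of the table.
clustered = [1] * n_ones + [0] * (n_rows - n_ones)

# Shuffled layout: same rows, but placement within blocks is effectively random.
shuffled = clustered[:]
random.Random(0).shuffle(shuffled)

clustered_sd = estimate_spread(clustered)
shuffled_sd = estimate_spread(shuffled)
print(f"clustered: sd={clustered_sd:.4f}  shuffled: sd={shuffled_sd:.4f}")
```

In the shuffled layout each sampled block behaves like a random set of rows, so the estimate is tight. In the clustered layout each block contributes roughly one independent observation, so many more blocks are needed for the same confidence.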
