On Fri, May 11, 2012 at 11:04 AM, Kevin Grittner
<kevin.gritt...@wicourts.gov> wrote:
> We will be probing random page numbers 1..P for random tuple indexes
> 1..M. So how many random probes by ctid does that require? The
> chance of a hit on each probe is ((T/P)/M) -- the average number of
> tuples per page divided by the number of tuple index values allowed.
> So we need (S*T)/((T/P)/M) probes. Simplifying, that winds up being
> S*M*P: the product of the sample size as a fraction, the maximum
> tuple index on a page, and the number of pages. (A calculation some
> may have jumped to as intuitively obvious.)
>
> So let's call the number of probes N. We randomly generate N
> distinct ctid values, where each is a random page number 1..P and a
> random tuple index 1..M. We attempt to read each of these in block
> number order, not following HOT chains. For each, if the tuple
> exists and is visible, it is part of our result set.
>
> Since T cancels out of that equation, we don't need to worry about
> estimating it. Results will be correct for any value of M which is
> *at least* as large as the maximum tuple index in the table,
> although values of M larger than that will degrade performance. The
> same holds true for the number of pages.
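(To make that procedure concrete, here is a minimal sketch in Python
-- an illustration of the arithmetic, not PostgreSQL code; the probe
itself is stubbed out, since it stands in for a heap fetch:)

    import math
    import random

    def sample_ctids(sample_frac, max_index, num_pages, rng=random):
        # N = S*M*P probes: sample fraction times maximum tuple index
        # times number of pages, as derived above.
        n = math.ceil(sample_frac * max_index * num_pages)
        # Draw N distinct (page, index) pairs from the P*M ctid space.
        picks = rng.sample(range(num_pages * max_index), n)
        # Decode and sort so probes happen in block-number order.
        ctids = sorted(divmod(k, max_index) for k in picks)
        return [(page + 1, idx + 1) for page, idx in ctids]  # 1-based

    # Example: P = 1000 pages, M = 300, 5% sample -> N = 15,000 probes.
    # Each probe hits with probability (T/P)/M, so the expected number
    # of visible tuples collected is N*(T/P)/M = S*T, as claimed.
    for page, idx in sample_ctids(0.05, 300, 1000):
        pass  # probe page `page`, line pointer `idx`; keep if visible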
The trouble is, AFAICS, that you can't bound M very well without
scanning the whole table. I mean, it's bounded by the theoretical
limit, but that's it.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
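(For reference, the theoretical limit in question is PostgreSQL's
MaxHeapTuplesPerPage, which works out to 291 for the default 8K block
size. A sketch of that arithmetic, with the header sizes hard-coded
as assumptions for a stock 64-bit build:)

    # Mirrors the MaxHeapTuplesPerPage macro: how many line pointers
    # could fit on a page if every tuple were just a header.  Sizes
    # below are assumptions for BLCKSZ = 8192 with 8-byte MAXALIGN.
    BLCKSZ = 8192        # default heap page size
    PAGE_HEADER = 24     # SizeOfPageHeaderData
    TUPLE_HEADER = 24    # MAXALIGN(SizeofHeapTupleHeader): 23 rounds up
    ITEM_ID = 4          # sizeof(ItemIdData), one line pointer

    max_tuples = (BLCKSZ - PAGE_HEADER) // (TUPLE_HEADER + ITEM_ID)
    print(max_tuples)    # 291: the only safe a priori bound for M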