On 12/12/13 08:39, Gavin Flower wrote:
On 12/12/13 08:31, Kevin Grittner wrote:
Gavin Flower <gavinflo...@archidevsys.co.nz> wrote:

For example, assume 1000 rows of 200 bytes and 1000 rows of 20 bytes,
using 400-byte pages.  In the pathologically worst case, assuming
maximum packing density and no page having both types: the large rows would
occupy 500 pages and the smaller rows 50 pages. So if one selected 11
pages at random, one would get about 10 pages of large rows and about one
page of small rows!
With 10 * 2 = 20 large rows, and 1 * 20 = 20 small rows: the same expected number of each kind, hence no bias.
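The arithmetic above can be sanity-checked with a small calculation (a sketch of the same toy model: 400-byte pages, perfect packing, no page mixing the two row sizes, no per-row overhead):

```python
# Toy model from the example above: 400-byte pages, perfectly packed,
# no page containing both kinds of rows, no per-row overhead.
PAGE_SIZE = 400

def pages_for(n_rows, row_size):
    """Pages needed at maximum packing density (rows never split)."""
    rows_per_page = PAGE_SIZE // row_size
    return n_rows / rows_per_page

large_pages = pages_for(1000, 200)       # 500 pages, 2 rows each
small_pages = pages_for(1000, 20)        # 50 pages, 20 rows each
total_pages = large_pages + small_pages  # 550 pages

def expected_rows(kind_pages, rows_per_page, pages_sampled=11):
    """Expected rows of one kind in a uniform sample of whole pages."""
    return pages_sampled * (kind_pages / total_pages) * rows_per_page

print(expected_rows(large_pages, 2))   # -> 20.0 large rows
print(expected_rows(small_pages, 20))  # -> 20.0 small rows
```

The expected counts come out equal because each kind contributes (its pages / total pages) x (rows per page) per sampled page, which collapses to (its row count / total pages) regardless of row size.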

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Sorry, I've simply come up with well-argued nonsense!

Kevin, you're dead right.


Cheers,
Gavin


I looked at:
http://www.postgresql.org/docs/current/interactive/storage-page-layout.html
This says that each row has an overhead, which suggests there should be a bias towards small rows.
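A quick back-of-envelope check suggests the per-row overhead changes how many pages each kind occupies, but not the expected per-kind row count from a uniform page sample. This is a sketch only: the flat 24-byte overhead and the 400-byte pages from upthread are assumptions for illustration, not PostgreSQL's exact figures.

```python
# Same toy model as upthread, but with an assumed flat per-row
# overhead (24 bytes is illustrative, not the exact tuple header size).
PAGE_SIZE = 400
OVERHEAD = 24

def rows_per_page(row_size):
    return PAGE_SIZE // (row_size + OVERHEAD)

def pages_for(n_rows, row_size):
    return n_rows / rows_per_page(row_size)

kinds = [(1000, 200), (1000, 20)]  # (row count, row size in bytes)
total_pages = sum(pages_for(n, s) for n, s in kinds)

def expected_sampled(n_rows, row_size, pages_sampled=11):
    """Expected rows of one kind when sampling whole pages uniformly."""
    frac = pages_for(n_rows, row_size) / total_pages
    return pages_sampled * frac * rows_per_page(row_size)

for n, s in kinds:
    print(s, expected_sampled(n, s))  # both kinds come out equal (up to rounding)
```

In this model the expectation for each kind reduces to pages_sampled * (its row count / total pages), so an overhead that applies equally to every row shifts the page counts without introducing a bias between the two kinds.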

There must be a lot of things going on that I'm simply not aware of that affect sampling bias...


Cheers,
Gavin


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
