Joshua D. Drake wrote:
postgres=# analyze verbose test_ten_million;
INFO:  analyzing "public.test_ten_million"
INFO:  "test_ten_million": scanned 3000 of 44248 pages, containing 678000
live rows and 0 dead rows; 3000 rows in sample, 10000048 estimated total
rows
ANALYZE
Time: 20145.148 ms

At ever larger table sizes, this turns into 3000 random seeks all over the drive, one at a time, because there's no async I/O here to queue requests any better than that for this access pattern. Let's say they take 10ms each, not an unrealistic figure on current hardware. That's 30 seconds, best case, which is similar to what JD's example shows even on a pretty small data set. Under load it could easily take over a minute, hammering the disks the whole time, and in a TOAST situation you're doing even more work. It's not outrageous and it doesn't scale linearly with table size, but it's not something you want to happen any more than you have to either; consider the poor client who is trying to get their work done while that is going on.
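
To make that arithmetic concrete, here's a rough sketch of the worst case; the ~10ms-per-seek figure is just the assumption from above, and the 3000 sampled pages come from JD's ANALYZE output:

-- Pages in the table vs. the 3000 pages ANALYZE sampled above:
SELECT relpages, reltuples FROM pg_class WHERE relname = 'test_ten_million';

-- Worst case for the sample: one random seek per sampled page, nothing cached,
-- at an assumed ~10 ms per seek:
SELECT 3000 * 10 / 1000.0 AS worst_case_seconds;  -- 30 seconds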

On smaller tables, you're both more likely to grab a useful next page via readahead and to have the data you need cached in RAM already. There are a couple of "shelves" in the time it takes ANALYZE to finish as you exceed L1/L2 CPU cache size and then RAM size, after which performance trails off as the seeks get longer and longer once the data you need is spread further across the disk(s). That the logical beginning of a drive is much faster than the logical end doesn't help either. I should generate that graph again one day, somewhere I can release it...
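
As a side note, a quick way to guess which side of those shelves a given table falls on is to compare its on-disk size against shared_buffers and the RAM left over for the OS cache; this isn't from the example above, just a sanity check you can run yourself:

-- If the table is much bigger than shared_buffers plus the free OS cache,
-- expect the seek-bound behavior described above rather than the cached case.
SELECT pg_size_pretty(pg_relation_size('test_ten_million')) AS table_size;
SHOW shared_buffers;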

--
Greg Smith    2ndQuadrant   Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com  www.2ndQuadrant.com


