On 02/07/2012 03:23 PM, Bruce Momjian wrote:
Where did you see that there will be an improvement in the 9.2
documentation?  I don't see an improvement.

I commented that I'm hoping for an improvement in the documentation of how much timing overhead impacts attempts to measure this area better. That's from the "add timing of buffer I/O requests" feature submission. I'm not sure if Bene read too much into that or not; I didn't mean to imply that the docs around random_page_cost have gotten better.

This particular complaint is extremely common though, seems to pop up on one of the lists a few times each year. Your suggested doc fix is fine as a quick one, but I think it might be worth expanding further on this topic. Something discussing SSDs seems due here too. Here's a first draft of a longer discussion, to be inserted just after where it states the default value is 4.0:

True random access to mechanical disk storage will normally be more expensive than this default suggests. The value used is lower to reflect caching effects. Some common random accesses to disk, such as indexed reads, are considered likely to be in cache. The default value can be thought of as modeling random access as 40 times as expensive as sequential, while expecting that 90% of random reads will actually be cached.

If you believe a high cache rate is an incorrect assumption for your workload, you might increase random_page_cost to closer reflect the true cost of random reads against your storage. Correspondingly, if your data is likely to be completely cached, such as when the database is smaller than the total memory in the server, decreasing random_page_cost can be appropriate. Storage where the true cost of random reads is low, such as solid-state drives and similar memory-based devices, might also find lower values of random_page_cost better reflect the real-world cost of that operation.

===

I think of the value as being more like 80 times as expensive and a 95% hit rate, but the above seems more likely to turn into understandable math to a first-time reader of this section. I stopped just short of recommending a value for the completely cached case. I normally use 1.01 there; I know others prefer going fully to 1.0 instead. That argument seems like it could rage on for some time.

--
Greg Smith   2ndQuadrant US    g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to