On Tue, 2008-08-12 at 23:58 +0100, Gregory Stark wrote:
> People lower random_page_cost because we're not doing a good job
> estimating how much of a table is in cache.
Is it because of a bad estimate of how much of a table is in cache, or a bad assumption about the distribution of access to the table?

If the planner were to know, for example, that 10-20% of the table is likely to be in cache, would that really make a difference in the plan? I suspect it would matter mostly when the entire table is cached, the correlation is low, and the query is somewhat selective (which is a possible use case, but a fairly narrow one).

I suspect this has more to do with the fact that some data is naturally going to be accessed much more frequently than other data in a large table. But how do you determine, at plan time, whether the query will be mostly accessing hot data or cold data?

Regards,
	Jeff Davis
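For context, a minimal sketch of the workaround the quoted message describes (the table name and sizes below are hypothetical, for illustration only): lowering random_page_cost tells the planner that a random page fetch costs little more than a sequential one, which is a crude stand-in for "this table is in cache" and tends to tip selective queries toward index scans.

    -- Hypothetical table, populated just to give the planner something
    -- to estimate against.
    CREATE TABLE big_table (id int, payload text);
    INSERT INTO big_table
        SELECT g, repeat('x', 100) FROM generate_series(1, 1000000) g;
    CREATE INDEX big_table_id_idx ON big_table (id);
    ANALYZE big_table;

    -- With the default random_page_cost = 4.0, a moderately selective
    -- range query may well be costed as a seq scan or bitmap scan:
    EXPLAIN SELECT * FROM big_table WHERE id BETWEEN 1 AND 50000;

    -- Lowering random_page_cost approximates "this table is cached"
    -- and typically tips the same query toward the index:
    SET random_page_cost = 1.1;
    EXPLAIN SELECT * FROM big_table WHERE id BETWEEN 1 AND 50000;

Note that random_page_cost is a cluster-wide (or per-session) knob, whereas the questions above concern cache residency that varies per table, and even per row, between hot and cold data.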