"Simon Riggs" <[EMAIL PROTECTED]> writes: > On Tue, 2008-08-12 at 15:46 -0400, Tom Lane wrote: >> Simon Riggs <[EMAIL PROTECTED]> writes: >> > Proposal: Make the first block of a seq scan cost random_page_cost, then >> > after that every additional block costs seq_page_cost. >> >> This is only going to matter for a table of 1 block (or at least very >> few blocks), and for such a table it's highly likely that it's in RAM >> anyway. So I'm unconvinced that the proposed change represents a >> better model of reality.
I think the first block of a sequential scan is clearly a random access. If that doesn't represent reality well then perhaps we need to tackle both problems together. Somehow we need to discount scan i/o cost based on how much of the table we expect to be in cache. For 1-block tables if we should expect them to be in cache we should be zeroing out all the i/o cost whether random or sequential. > The access cost should be the same for a 1 block table, whether its on > disk or in memory. Uhm, huh? That can't be what you meant to write? > AFAICS the cost cross-over is much higher than the actual elapsed time > cross-over for both narrow and wide tables. > > Thats why using SET enable_seqscan=off helps performance in many cases, > or why people reduce random_page_cost to force index selection. People lower random_page_cost because we're not doing a good job estimating how much of a table is in cache. I think that would be a great target for some careful analysis. If you can come up with specific places and reasonable heuristics to discount i/o costs based on effective_cache_size and then demonstrate cases where it produces consistently better cost estimates that would be a huge help. I've been running benchmarks where I see accurate random_page_costs of 13-80 on uncached data on a moderate sized raid array. But of course when a some of the data is cached the effective random_page_cost is much much lower than that. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's RemoteDBA services! -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers