This sounds like a good idea. - Luke
Msg is shrt cuz m on ma treo

-----Original Message-----
From: Simon Riggs [mailto:[EMAIL PROTECTED]]
Sent: Monday, March 05, 2007 02:37 PM Eastern Standard Time
To: Josh Berkus; Tom Lane; Pavan Deolasee; Mark Kirkwood; Gavin Sherry; Luke Lonergan; PGSQL Hackers; Doug Rady; Sherry Moore
Cc: pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] Bug: Buffer cache is not scan resistant

On Mon, 2007-03-05 at 10:46 -0800, Josh Berkus wrote:
> Tom,
>
> > I seem to recall that we've previously discussed the idea of letting the
> > clock sweep decrement the usage_count before testing for 0, so that a
> > buffer could be reused on the first sweep after it was initially used,
> > but that we rejected it as being a bad idea. But at least with large
> > shared_buffers it doesn't sound like such a bad idea.
>
> Note, though, that the current algorithm is working very, very well for OLTP
> benchmarks, so we'd want to be careful not to gain performance in one area at
> the expense of another.

Agreed.

What we should also add to the analysis is that this effect occurs only when uniform workloads such as SeqScan, VACUUM or COPY are present. When you have lots of indexed access, the scan workloads don't have as much effect on cache pollution as we are seeing in these tests.

Itagaki-san and I were discussing in January the idea of cache-looping, whereby a process begins to reuse its own buffers in a ring of ~32 buffers. When we cycle back round, if usage_count==1 then we assume that we can reuse that buffer. This avoids cache swamping for read and write workloads, and also avoids too-frequent WAL writing for VACUUM.

It would be simple to implement the ring buffer and enable/disable it with a hint, StrategyHintCyclicBufferReuse(), in a similar manner to the hint VACUUM provides now.

This would maintain the beneficial behaviour for OLTP, while keeping data within the L2 cache for DSS and bulk workloads.

--
  Simon Riggs
  EnterpriseDB   http://www.enterprisedb.com