On 12/12/2004 9:43 PM, Neil Conway wrote:
On Sun, 2004-12-12 at 22:08 +0000, Simon Riggs wrote:
> On Sun, 2004-12-12 at 05:46, Neil Conway wrote:
> Is the plan to make bgwriter_percent = 100 the default setting?
Hmm...must confess that my only plan is: i) discover the dynamic behaviour of the bgwriter, ii) fix any bugs or weirdness as quickly as possible, and iii) try to find a way to set the bgwriter defaults.
I was just curious why you were bothering to special-case bgwriter_percent = 100 if it's not going to be the default setting (in which case I would be surprised if more than 1 in 10 users would take advantage of the patch).
Right now, bgwriter_delay is useless because the O(N) behaviour makes it impossible to set it any lower when shared_buffers is large.
BTW, I wouldn't be _too_ worried about O(N) behavior, except that we do this scan while holding the BufMgrLock, which is a well known source of contention. So reducing the time we hold that lock would be good.
Your question has made me rethink the exact objective of the bgwriter's actions: as it is coded now, the bgwriter looks for dirty blocks no matter where they are in the list.
Not sure what you mean. StrategyDirtyBufferList() returns the specified number of dirty buffers in order, starting with the T1/T2 LRUs and going back to the MRUs of both lists. bgwriter_percent effectively ignores some portion of the tail of that list, so we end up just flushing the buffers closest to the T1/T2 LRUs. How is this different from what you're describing?
bgwriter_percent would be the % of shared_buffers that are searched (from the LRU end) to see if they contain dirty buffers, which are then written to disk.
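To make the mechanism being discussed concrete, here is a minimal standalone sketch in C. It is not the actual bufmgr code; the types, names and numbers are invented for illustration. It shows a buffer list kept in LRU-to-MRU order, with bgwriter_percent limiting how far from the LRU end the scan goes and bgwriter_maxpages capping the number of writes per round:

/*
 * Illustrative sketch only -- not the actual PostgreSQL bufmgr code.
 * Dirty buffers are looked for in LRU-to-MRU order; bgwriter_percent
 * limits how far from the LRU end the scan goes, and bgwriter_maxpages
 * caps how many buffers get written in one round.
 */
#include <stdio.h>
#include <stdbool.h>

#define NBUFFERS 16             /* stand-in for shared_buffers */

typedef struct
{
    int     buf_id;
    bool    dirty;
} FakeBuffer;

/* buffers[0] is closest to the LRU end, buffers[NBUFFERS-1] to the MRU end */
static FakeBuffer buffers[NBUFFERS];

static int  bgwriter_percent = 25;      /* % of the pool searched */
static int  bgwriter_maxpages = 5;      /* hard cap on pages written */

static void
fake_write_buffer(FakeBuffer *buf)
{
    printf("writing buffer %d\n", buf->buf_id);
    buf->dirty = false;
}

static void
bgwriter_sketch(void)
{
    int     limit = (NBUFFERS * bgwriter_percent) / 100;
    int     written = 0;

    /* search only the portion of the list closest to the LRU end */
    for (int i = 0; i < limit && written < bgwriter_maxpages; i++)
    {
        if (buffers[i].dirty)
        {
            fake_write_buffer(&buffers[i]);
            written++;
        }
    }
}

int
main(void)
{
    for (int i = 0; i < NBUFFERS; i++)
    {
        buffers[i].buf_id = i;
        buffers[i].dirty = (i % 3 == 0);        /* arbitrary dirty pattern */
    }
    bgwriter_sketch();
    return 0;
}

With settings like these, each round only ever touches the same small window next to the LRU end, which is the behaviour questioned in the next paragraph.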
By definition, buffers closest to the LRU end of the lists are not frequently accessed. If we only search the N% of the lists closest to LRU, we will probably end up flushing just those pages to disk -- and then not flushing anything else to disk in the subsequent bgwriter calls because all the buffers close to the LRU will be non-dirty. That's okay if all we're concerned about is avoiding write() by a real backend, but we also want to smooth out checkpoint load, which I don't think this approach would do well.
I suggest just getting rid of bgwriter_percent: AFAICS bgwriter_maxpages is all the tuning we need, and I think "max # of pages to write" is a simpler and more logical tuning knob than "% of the buffer pool to scan looking for dirty buffers." So at each bufmgr invocation, we pick at most bgwriter_maxpages dirty pages from the pool, using the pages closest to the LRUs of T1 and T2. I'd be happy to supply a patch to implement that if you think it sounds okay.
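In terms of the toy sketch above (reusing its fake buffers[], NBUFFERS, bgwriter_maxpages and fake_write_buffer(), so again purely illustrative and not real bufmgr code), the proposed simplification would shrink the selection to something like:

/*
 * Sketch of the proposed "maxpages only" selection: no bgwriter_percent,
 * just scan from the LRU end until bgwriter_maxpages dirty buffers have
 * been written (or the whole pool has been looked at).
 */
static void
bgwriter_maxpages_only_sketch(void)
{
    int     written = 0;

    for (int i = 0; i < NBUFFERS && written < bgwriter_maxpages; i++)
    {
        if (buffers[i].dirty)
        {
            fake_write_buffer(&buffers[i]);
            written++;
        }
    }
}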
I too don't think that this approach will retain the checkpoint-smoothing effect the current implementation has.
The real problem is that the "cleaner" the buffer pool is, the longer the scan for dirty buffers will take because the dirty blocks tend to be at the very end of the scan order. The real solution for this would be not to scan the whole pool, but to maintain a separate chain of only dirty buffers in LRU order.
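As a rough illustration of that idea, here is a standalone toy sketch in C (not PostgreSQL code; for simplicity it keeps the chain in the order buffers were dirtied rather than strict LRU order, which would need a little more bookkeeping):

/*
 * Illustrative sketch of the "separate dirty-buffer chain" idea.
 * Each buffer carries link fields; dirtying a buffer appends it to
 * the chain, and writing it unlinks it, so the bgwriter never has
 * to scan clean buffers at all.
 */
#include <stdio.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct DirtyBuffer
{
    int                  buf_id;
    bool                 dirty;
    struct DirtyBuffer  *dirty_prev;    /* links valid only while dirty */
    struct DirtyBuffer  *dirty_next;
} DirtyBuffer;

/* head = least recently dirtied, tail = most recently dirtied */
static DirtyBuffer *dirty_head = NULL;
static DirtyBuffer *dirty_tail = NULL;

static void
mark_dirty(DirtyBuffer *buf)
{
    if (buf->dirty)
        return;                         /* already on the chain */
    buf->dirty = true;
    buf->dirty_prev = dirty_tail;
    buf->dirty_next = NULL;
    if (dirty_tail)
        dirty_tail->dirty_next = buf;
    else
        dirty_head = buf;
    dirty_tail = buf;
}

static void
write_and_unlink(DirtyBuffer *buf)
{
    printf("writing buffer %d\n", buf->buf_id);
    if (buf->dirty_prev)
        buf->dirty_prev->dirty_next = buf->dirty_next;
    else
        dirty_head = buf->dirty_next;
    if (buf->dirty_next)
        buf->dirty_next->dirty_prev = buf->dirty_prev;
    else
        dirty_tail = buf->dirty_prev;
    buf->dirty = false;
    buf->dirty_prev = buf->dirty_next = NULL;
}

/*
 * The bgwriter just walks the chain: the cost depends on the number
 * of dirty buffers written, not on the size of the buffer pool.
 */
static void
bgwriter_chain_sketch(int maxpages)
{
    int     written = 0;

    while (dirty_head && written < maxpages)
    {
        write_and_unlink(dirty_head);
        written++;
    }
}

int
main(void)
{
    DirtyBuffer bufs[8] = {{0}};

    for (int i = 0; i < 8; i++)
        bufs[i].buf_id = i;

    mark_dirty(&bufs[3]);
    mark_dirty(&bufs[1]);
    mark_dirty(&bufs[6]);

    bgwriter_chain_sketch(2);           /* writes buffers 3 and 1 */
    return 0;
}

The point of such a chain is that a bgwriter round then costs in proportion to the number of buffers it actually writes, not to the size of shared_buffers.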
Jan
--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== [EMAIL PROTECTED] #