On Thu, 2005-11-03 at 08:03 -0800, Mark Wong wrote:
> On Tue, 01 Nov 2005 07:32:32 +0000
> Simon Riggs <[EMAIL PROTECTED]> wrote:
> > Concerned about the awful checkpointing. Can you bump wal_buffers to
> > 8192 just to make sure? That's way too high, but just to prove it.
> >
> > We need to reduce the number of blocks to be written at checkpoint.
> >
> >  bgwriter_all_maxpages   5 -> 15
> >  bgwriter_all_percent    0.333
> >  bgwriter_delay          200
> >  bgwriter_lru_maxpages   5 -> 7
> >  bgwriter_lru_percent    1
> >
> >  shared_buffers          set lower to 100000
> >  (which should cause some amusement on-list)
>
>
> Okay, here goes, all with the same source base w/ the lw.patch:
>
> http://www.testing.osdl.org/projects/dbt2dev/results/dev4-014/44/
> only increased wal_buffers to 8192 from 2048
> 3242 notpm

That looks to me like a clear negative effect from increasing
wal_buffers. Try putting it back down to 1024.
Looks like we need to plug that gap.

> http://www.testing.osdl.org/projects/dbt2dev/results/dev4-014/43/
> only increased bgwriter_all_maxpages to 15, and bgwriter_lru_maxpages to 7
> 3019 notpm (but more interesting graph)

Man that sucks. What the heck is happening there? Hackers - if you are
watching, you should see this graph - it shows some very poor behaviour.

I'm not happy with that performance at all.... any chance you could
re-run that exact same test to see if we can get that repeatably?

I see you have
vm.dirty_writeback_centisecs = 0

which pretty much means the pdflush daemons never write anything to
disk, even when the bgwriter is active.

Could we set the bgwriter stuff back to default and try
vm.dirty_writeback_centisecs = 500

> http://www.testing.osdl.org/projects/dbt2dev/results/dev4-014/45/
> Same as the previously listed run with shared_buffers lowered to 10000
> 2503 notpm

Sorry, that was 100,000, not 10,000.

Looks like we need dates on the log_line_prefix so we can check the
logs.

...not sure about the oprofile results. Seems to show CreateLWLocks
being as high as xlog_insert, which is mad. Either that shows startup
time is excessive, or it means the oprofile timing range is too short.
Not sure which.

Best Regards, Simon Riggs
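
For scale, the bgwriter settings discussed in this thread put a ceiling
on how many blocks the background writer can push out between
checkpoints. The sketch below is back-of-envelope arithmetic only, not
anything taken from the test runs. It assumes an 8 kB block size and
that each bgwriter round takes exactly bgwriter_delay milliseconds
(time spent writing is ignored), and it treats the "all" and "lru"
scans as never picking the same buffer, so the real rate can only be
lower. The function names are made up for illustration.

```python
# Rough upper bound on background-writer throughput for the settings
# mentioned in this thread. Assumptions (not stated in the thread):
# 8 kB blocks, and one round per bgwriter_delay with zero write time.

BLOCK_SIZE_BYTES = 8192  # assumed default PostgreSQL block size


def bgwriter_ceiling(delay_ms, all_maxpages, lru_maxpages):
    """Return (pages_per_second, bytes_per_second) upper bounds."""
    rounds_per_second = 1000 / delay_ms
    # Each round writes at most all_maxpages + lru_maxpages buffers.
    pages_per_second = rounds_per_second * (all_maxpages + lru_maxpages)
    return pages_per_second, pages_per_second * BLOCK_SIZE_BYTES


def seconds_to_write_all(shared_buffers, pages_per_second):
    """Seconds to write every buffer once at the ceiling rate."""
    return shared_buffers / pages_per_second


if __name__ == "__main__":
    # (label, bgwriter_all_maxpages, bgwriter_lru_maxpages)
    for label, all_max, lru_max in [("default", 5, 5), ("tested", 15, 7)]:
        pages, nbytes = bgwriter_ceiling(200, all_max, lru_max)
        secs = seconds_to_write_all(100000, pages)
        print(f"{label}: {pages:.0f} pages/s, {nbytes / 1024:.0f} kB/s, "
              f"{secs:.0f} s to cycle 100000 buffers")
```

Even at the raised limits the ceiling is on the order of a hundred
pages per second, which is small next to shared_buffers = 100000. Under
these assumptions, most dirty blocks would still be left for the
checkpoint (or the kernel) to write.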