Josh Berkus <josh@agliodbs.com> writes:
> So this is obviously a major performance problem.  It could be fixed by
> turning off checkpointing completely, but I don't think that's really
> feasible.  Any clue on why clock-sweep should be so slammed by checkpoints?

Hm, notice that the processor utilization doesn't actually drop all
that much, so it seems it's not fundamentally an "I/O storm" kind of
issue.  I'm thinking that the issue may be that just after a
checkpoint, the first modification of each page incurs a dump of the
whole page into WAL, with attendant CRC-calculation and other costs.
The reason the long intercheckpoint interval yields such nifty
performance is that it lets you ramp up into a regime where almost
none of the pages being touched need to be dumped to WAL as a whole.
Unfortunately that regime hasn't got a lot to do with reality ...

You could test this theory by disabling the page-dump-out logic to see
what happens to the performance curve.  In CVS tip, look at
XLogCheckBuffer() in src/backend/access/transam/xlog.c, and dike out
the whole large if() in it --- just have it set *lsn and return false.

(I assume this *is* CVS tip, or near to it?  The recent CRC32 and
omit-the-hole changes should affect the costs of this quite a bit.)

			regards, tom lane
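
A rough way to see the effect described above is a toy model that
counts how many page modifications would need a whole-page image when
the first change to a page after each checkpoint logs the full page.
This is only a sketch and not PostgreSQL code: the function name
count_full_page_images(), its parameters, and the uniformly random
page-access pattern (a simple LCG) are all assumptions made for
illustration, and real workloads will touch pages far less evenly.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Toy model (hypothetical, NOT PostgreSQL code).
 *
 * Simulates nmods page modifications spread over npages pages, with a
 * checkpoint before every checkpoint_every-th modification.  The first
 * modification of a page after a checkpoint counts as a full-page image.
 *
 * Returns the number of full-page images, or -1 on bad input or
 * allocation failure.
 */
long count_full_page_images(int npages, long nmods, long checkpoint_every)
{
    if (npages <= 0 || nmods < 0 || checkpoint_every <= 0)
        return -1;

    /* logged[p] is true once page p has been dumped since the last checkpoint */
    bool *logged = calloc((size_t) npages, sizeof(bool));
    if (logged == NULL)
        return -1;

    uint32_t state = 12345u;    /* fixed seed, so runs are repeatable */
    long full_images = 0;

    for (long i = 0; i < nmods; i++)
    {
        /* A checkpoint forgets which pages were already dumped. */
        if (i % checkpoint_every == 0)
            memset(logged, 0, (size_t) npages * sizeof(bool));

        /* Assumed access pattern: pseudo-random page from a simple LCG. */
        state = state * 1664525u + 1013904223u;
        int page = (int) ((state >> 16) % (uint32_t) npages);

        if (!logged[page])
        {
            logged[page] = true;    /* whole page goes to WAL this time */
            full_images++;
        }
        /* otherwise only the (much smaller) change record is logged */
    }

    free(logged);
    return full_images;
}
```

With checkpoint_every set to 1, every modification pays for a full
page.  Once the interval is at least as long as the whole run, the
count can never exceed the number of distinct pages touched, which is
the regime in which almost no modification needs a whole-page dump.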