Jeff Davis <[EMAIL PROTECTED]> writes: > What's happening is that you have a checkpoint_timeout of 5 minutes, and > that checkpoint must write a checkpoint record in the WAL, prompting the > archiving.
> If you want it to happen less frequently, it's often safe to have > checkpoint timeout set to something larger by a reasonable amount. I think you're confusing checkpoint_timeout and archive_timeout... I seem to recall this behavior having been discussed before, but I can't find it in the archives right now. What is happening is that after each checkpoint_timeout, we test to see if we need to write a new checkpoint; which is determined by whether anything's been inserted into WAL since the start of the last checkpoint. And after each archive_timeout, we test to see if we need to flush the current WAL segment out to the archive; which is determined by whether the write pointer is currently exactly at the start of a segment or not. Which would be fine, except that the "has anything been inserted since last checkpoint" test is actually done by seeing if the WAL insert pointer has moved. Which it will have, if we did an archive flush. And that means that each of these activities makes it look to the other one like something has happened, and so you get a checkpoint record every checkpoint_timeout seconds, and then we flush the entire WAL file (containing only that record), even if the database is in reality completely idle. Lather, rinse, repeat. In the prior discussion that I seem to remember, we didn't think of a decent solution, and it kinda fell off the radar since zero-activity isn't too interesting to a lot of folks. However, chewing on it again I think I've come up with a good idea that will fix this and actually simplify the code a bit: * Add a boolean flag insertedXLog to XLogCtlInsert, which means "at least one WAL record has been inserted since start of last checkpoint". Also add a flag completedCkpt somewhere in XLogCtlData, which means "checkpoint successfully completed"; this second flag is only used by checkpoint so it can be considered as being protected by the CheckpointLock. At startup we can initialize insertedXLog = false, completedCkpt = true. * XLogInsert sets insertedXLog to true while holding WALInsertLock, *except* when inserting either a checkpoint record or an xlog switch record; in those cases it doesn't change the flag. * CreateCheckpoint replaces its current rather complex test (lines 5693-5703 in CVS-tip xlog.c) with "if insertedXLog is clear and completedCkpt is set, we need not checkpoint". If it does have to perform a checkpoint, it clears both flags before releasing WALInsertLock. * After successful completion of a checkpoint, completedCkpt gets set. Because insertedXLog is cleared at the same time the checkpoint's REDO pointer is determined, this will correctly implement the requirement of detecting whether anything has been inserted since the last REDO point. This replaces the current indirect test involving comparing the last checkpoint's REDO pointer to its own address. However we have to not set insertedXLog when we finally do insert the checkpoint record, thus the special case is needed in XLogInsert. The other special case of ignoring xlog switch is what's needed to fix the bug, and is obviously OK because an xlog switch doesn't represent a checkpointable change. The reason we need the completedCkpt flag is that if a checkpoint fails partway through, it would nonetheless have cleared insertedXLog, and we don't want that to cause us to not retry the checkpoint next time. This is slightly warty but it certainly seems a lot clearer than the current test in lines 5693-5703. The couple of lines to be added to XLogInsert should have negligible performance impact. Comments? regards, tom lane ---------------------------(end of broadcast)--------------------------- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq