On Sun, 27 Jan 2008, [EMAIL PROTECTED] wrote:
ok, at the moment i got some traffic and my load is at 1.5. But now with logging the timestamp I have seen that the long durations are quite regular at intervals of 10 minutes.
Sure sounds like checkpoints. You should turn on the checkpoint warning feature so it always triggers and see if the long queries completely just after the checkpoints finish. Notes on that and what you can do to possibly improve checkpoint behavior are at http://www.westnet.com/~gsmith/content/postgresql/chkp-bgw-83.htm The early parts of that mostly refer to 8.2 but 8.1 is basically the same in this area.
but what in hell can make nextval take so long? even if checkpointing is badly configured.
You're in a situation where your amount of RAM far exceeds your disk I/O capabilities. Brutally bad checkpoints are easy to encounter in that setup. Linux by default will use 10% of RAM to hold writes. At checkpoint time, that entire Linux buffer cache has to be cleared of database writes on top of what's written by the checkpoint itself. How long do you think it takes to write >800MB of database data with a significant random-access component to it when your disk is a simple RAID-1? 20 seconds is not out of the question.
You may want to significantly reduce the size of the Linux write buffer and see if that helps. http://www.westnet.com/~gsmith/content/linux-pdflush.htm goes over theory and suggestions here.
I always thought that nextval is one of the fastest operations. So if it takes 500 ms, fine. things like this can always happen, but 20 seconds sounds more like a hardware failure. But i can't see any.
Just about everything gets blocked behind the worse checkpoint spikes. The thing that kind of bothers me about your case is that I'd expect other queries would also be blocked and you'd have a whole set of >250ms ones lined up just after the checkpoint is done. That you're only reporting issues with nextval makes me wonder if there isn't some other locking driving the main behavior, perhaps something that just gets worse at checkpoint time rather than being directly caused by it.
-- * Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD ---------------------------(end of broadcast)--------------------------- TIP 2: Don't 'kill -9' the postmaster