Zeugswetter Andreas SB SD wrote:
> Why not use the checkpointer itself in between checkpoints?
> Use a min and a max dirty setting like Informix. Start writing
> when more than max are dirty, stop when at min. This avoids writing
> single pages (which is slow, since they cannot be grouped together
> by the OS).

Current approach is similar ... if I stretch the IO and syncing over the entire 150-300 second checkpoint interval, writing in groups of 50 pages followed by sync()+nap, the system purrs pretty nicely and without any peaks.
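
To make the pacing concrete, here is a minimal Python sketch of the idea, not the actual patch. It splits the dirty pages into groups of 50 and works out how long to nap after each group so the work spans the whole interval. The name plan_paced_flush and its parameters are invented for this illustration, and it assumes the writes and the sync themselves take no time.

```python
def plan_paced_flush(dirty_pages, interval_secs, batch_size=50):
    """Split dirty_pages into batches and compute a nap length per batch.

    Hypothetical helper: assumes writing and syncing a batch costs no
    time, so the entire interval is available for napping.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    # Consecutive slices of at most batch_size pages each.
    batches = [dirty_pages[i:i + batch_size]
               for i in range(0, len(dirty_pages), batch_size)]
    # Spread the naps evenly over the interval; nothing to do if no batches.
    nap_secs = interval_secs / len(batches) if batches else 0.0
    return batches, nap_secs


# 1000 dirty pages spread over a 300 second checkpoint interval.
batches, nap_secs = plan_paced_flush(list(range(1000)), 300)
print(len(batches), nap_secs)  # 20 15.0
```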

> But how do you handle a write IO bound system then? My thought was to let
> the checkpointer write dirty pages in between checkpoints with a min/max,
> but still try to do the checkpoint as fast as possible. I don't think
> stretching the checkpoint is good, since it needs to write hot pages, which
> the in-between IO should avoid doing. The checkpointer would have two tasks
> that it handles alternately: checkpoint, or flush LRU from max to min.
>
> Andreas

By actually moving a lot of the IO work into the checkpointer. It asks the buffer strategy about the order in which dirty blocks would currently get evicted from the cache. The checkpointer now flushes them in that order. Your "hot pages" will be found at the end of that list and thus flushed last in the checkpoint, which is good, because it keeps them dirty longer.
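
A small Python sketch of flushing in eviction order, under the assumption that a plain LRU list stands in for the buffer strategy (the real strategy may order blocks differently). The class LRUBufferCache and the function checkpoint are made-up names for this illustration only.

```python
from collections import OrderedDict


class LRUBufferCache:
    """Toy buffer cache: the least recently used page comes first."""

    def __init__(self):
        self.pages = OrderedDict()  # page_id -> dirty flag

    def touch(self, page_id, dirty=False):
        # Re-inserting moves the page to the "most recently used" end.
        was_dirty = self.pages.pop(page_id, False)
        self.pages[page_id] = was_dirty or dirty

    def dirty_in_eviction_order(self):
        # The order in which dirty pages would currently be evicted.
        return [p for p, dirty in self.pages.items() if dirty]


def checkpoint(cache, write_page):
    """Flush all dirty pages, coldest first, hottest last."""
    for page_id in cache.dirty_in_eviction_order():
        write_page(page_id)
        cache.pages[page_id] = False  # page is clean now, order unchanged


cache = LRUBufferCache()
for page in ("a", "b", "hot", "c"):
    cache.touch(page, dirty=True)
cache.touch("hot", dirty=True)  # "hot" gets modified again

written = []
checkpoint(cache, written.append)
print(written)  # ['a', 'b', 'c', 'hot']
```

The recently touched page ends up last in the write order, so it stays dirty for as long as the checkpoint allows.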


The problem with the checkpointer flushing as fast as possible is that the entire system literally freezes. In my tests I use something that resembles the transaction profile of a TPC-C, including the thinking and keying times. Those are important because they make the load realistic. A stock 7.4.RC1 handles a correctly scaled DB with new_order response times of 0.2 to 1.5 seconds, but when the checkpoint occurs, it can't keep up and the response times go up to anything between 20 and 60 seconds. What makes the situation worse is that in the meantime, all simulated terminals hit the "send" button again, which leads to a transaction pileup right during the checkpoint. It takes a while until the system recovers from that.

If the system is write-bound, the checkpointer will find so many dirty blocks that it has no time to nap and will burst them out as fast as possible anyway. Well, at least that's the theory.
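
The theory can be illustrated with a few lines of Python. This is only a sketch under a simple assumption: each batch gets an equal share of the remaining time, and the nap is whatever is left of that share after the write. The function nap_after_batch and its parameters are hypothetical.

```python
def nap_after_batch(secs_left, batches_left, batch_write_secs):
    """Return how long to sleep after writing one batch.

    Hypothetical pacing rule: every remaining batch gets an equal slice
    of the remaining time; the nap is the slice minus the write time,
    and never negative.
    """
    if batches_left <= 0:
        return 0.0
    budget = secs_left / batches_left
    return max(0.0, budget - batch_write_secs)


# Light load: 20 batches left in 300 seconds, plenty of time to nap.
print(nap_after_batch(300, 20, 0.5))    # 14.5
# Write-bound: 2000 batches left, the nap collapses to zero.
print(nap_after_batch(300, 2000, 0.5))  # 0.0
```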

PostgreSQL with the non-overwriting storage concept can never have hot-written pages for a long time anyway, can it? They fill up and cool down until vacuum.


Jan


--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== [EMAIL PROTECTED] #

