On 06.06.2013 15:31, Kevin Grittner wrote:
Heikki Linnakangas<hlinnakan...@vmware.com>  wrote:
On 05.06.2013 22:18, Kevin Grittner wrote:
Heikki Linnakangas<hlinnakan...@vmware.com>   wrote:

I was not thinking of making it a hard limit. It would be just
like checkpoint_segments from that point of view - if a
checkpoint takes a long time, max_wal_size might still be
exceeded.

Then I suggest we not use exactly that name.  I feel quite sure we
would get complaints from people if something labeled as "max" was
exceeded -- especially if they set that to the actual size of a
filesystem dedicated to WAL files.

You're probably right. Any suggestions for a better name?
wal_size_soft_limit?

After reading later posts on the thread, I would be inclined to
support making it a hard limit and adapting the behavior to match.

Well, that's a lot more difficult to implement. And even if we have a
hard limit, I think many people would still want a soft limit that
triggers a checkpoint but does not stop WAL writes from happening. So
what would we call that?

I'd love to see a hard limit too, but I see that as an orthogonal feature.

How about calling the (soft) limit "checkpoint_wal_size"? That goes
well together with checkpoint_timeout, meaning that a checkpoint will
be triggered if you're about to exceed the given size.
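
To make that concrete, in postgresql.conf it could look something like
this (the new name and the values are just placeholders for the
proposal, nothing is set in stone):

checkpoint_timeout = 5min       # time-based trigger, as today
checkpoint_wal_size = 1GB       # soft limit: trigger a checkpoint before
                                # roughly this much WAL accumulates in
                                # one checkpoint cycle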

> I'm also concerned about the "spin up" from idle to high activity.
> Perhaps a "min" should also be present, to mitigate repeated short
> checkpoint cycles for "bursty" environments?

With my proposal, you wouldn't get repeated short checkpoint cycles
with bursts. The checkpoint interval would be controlled by
checkpoint_timeout and checkpoint_wal_size. If there is a lot of
activity, checkpoints will happen more frequently, as
checkpoint_wal_size is reached sooner. But it would depend only on the
activity in the current checkpoint cycle, not on previous ones, so it
would make no difference whether you have a continuously high load or a
bursty one.
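
Roughly, I'm thinking of a trigger condition along these lines (just a
sketch to show the idea; the names are made up and this is not actual
checkpointer code):

#include <stdbool.h>
#include <stdint.h>

/* Made-up GUC-like settings, for illustration only. */
static int64_t checkpoint_wal_size = 1024L * 1024 * 1024;  /* 1 GB soft limit */
static int     checkpoint_timeout  = 300;                  /* seconds */

/*
 * Request a checkpoint when either the timeout elapses or the WAL
 * written in the *current* cycle approaches the soft size limit.
 * Only the current cycle counts; past cycles don't factor in here.
 */
static bool
checkpoint_needed(int64_t wal_bytes_this_cycle, int elapsed_secs)
{
    if (elapsed_secs >= checkpoint_timeout)
        return true;
    if (wal_bytes_this_cycle >= checkpoint_wal_size)
        return true;
    return false;
}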

The history would matter for the calculation of how many segments to
preallocate/recycle, however. Under the proposal, that would be
calculated separately from checkpoint_wal_size, and for that we'd use
some kind of a moving average of how many segments were used in
previous cycles. A min setting might be useful for that. We could also
try to make WAL file creation cheaper, i.e. by using posix_fallocate(),
as was proposed in another thread, and doing it in bgwriter or
walwriter. That would make it less important to get the estimate right
from a performance point of view, although you'd still want to get it
right to avoid running out of disk space (having the segments
preallocated ensures that they are available when needed).
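
Something like the sketch below is what I have in mind: a moving
average of segments used per cycle with a floor (the "min" you
suggested), plus creating a segment with posix_fallocate(). The names
and the smoothing factor are placeholders, not a worked-out design:

#include <fcntl.h>
#include <unistd.h>

#define WAL_SEG_SIZE   (16 * 1024 * 1024)   /* 16 MB WAL segment */

/* Placeholders for illustration; not actual GUCs or xlog.c code. */
static double segments_avg = 0.0;
static int    min_recycle_segments = 5;

/*
 * Update the estimate of how many segments to keep preallocated or
 * recycled, using a moving average of segments consumed in previous
 * cycles, never going below the configured minimum.
 */
static int
update_recycle_target(int segments_used_last_cycle)
{
    segments_avg = 0.9 * segments_avg + 0.1 * segments_used_last_cycle;
    if (segments_avg < min_recycle_segments)
        return min_recycle_segments;
    return (int) segments_avg;
}

/*
 * Create a new WAL segment cheaply with posix_fallocate() instead of
 * writing 16 MB of zeroes; this could be done in bgwriter or walwriter.
 */
static int
preallocate_wal_segment(const char *path)
{
    int fd = open(path, O_CREAT | O_EXCL | O_WRONLY, 0600);

    if (fd < 0)
        return -1;
    if (posix_fallocate(fd, 0, WAL_SEG_SIZE) != 0)
    {
        close(fd);
        unlink(path);
        return -1;
    }
    return close(fd);
}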

- Heikki

