On 06.06.2013 15:31, Kevin Grittner wrote:
> Heikki Linnakangas <hlinnakan...@vmware.com> wrote:
>> On 05.06.2013 22:18, Kevin Grittner wrote:
>>> Heikki Linnakangas <hlinnakan...@vmware.com> wrote:
>>>> I was not thinking of making it a hard limit. It would be just
>>>> like checkpoint_segments from that point of view - if a
>>>> checkpoint takes a long time, max_wal_size might still be
>>>> exceeded.
>>> Then I suggest we not use exactly that name. I feel quite sure we
>>> would get complaints from people if something labeled as "max" was
>>> exceeded -- especially if they set that to the actual size of a
>>> filesystem dedicated to WAL files.
>> You're probably right. Any suggestions for a better name?
> wal_size_soft_limit?
>
> After reading later posts on the thread, I would be inclined to
> support making it a hard limit and adapting the behavior to match.
Well, that's a lot more difficult to implement. And even if we have a
hard limit, I think many people would still want to have a soft limit
that would trigger a checkpoint, but would not stop WAL writes from
happening. So what would we call that?
I'd love to see a hard limit too, but I see that as an orthogonal feature.
How about calling the (soft) limit "checkpoint_wal_size"? That goes well
together with checkpoint_timeout, meaning that a checkpoint will be
triggered if you're about to exceed the given size.
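For concreteness, under that naming scheme a configuration might look like the fragment below (a hypothetical sketch; checkpoint_timeout is an existing setting, while checkpoint_wal_size is only the name proposed here, and the values are invented for illustration):

```
# A checkpoint is triggered when either condition is met:
checkpoint_timeout = 5min       # existing time-based trigger
checkpoint_wal_size = 1GB       # proposed soft size-based trigger
```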
> I'm also concerned about the "spin up" from idle to high activity.
> Perhaps a "min" should also be present, to mitigate repeated short
> checkpoint cycles for "bursty" environments?
With my proposal, you wouldn't get repeated short checkpoint cycles with
bursts. The checkpoint interval would be controlled by checkpoint_timeout
and checkpoint_wal_size. If there is a lot of activity, checkpoints will
happen more frequently, as checkpoint_wal_size is reached sooner. But that
would not depend on the activity in previous checkpoint cycles, only the
current one, so it would make no difference whether you have a continuously
high load or a bursty one.
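To spell out the triggering rule being described: a checkpoint starts when either the timeout elapses or the WAL written since the last checkpoint reaches the size limit, whichever comes first. A minimal sketch (all names and values are invented for illustration, not PostgreSQL internals):

```python
# Hypothetical sketch of the proposed checkpoint triggering rule.
# A checkpoint is due when EITHER the time-based OR the size-based
# condition is met; history from earlier cycles plays no role here.

CHECKPOINT_TIMEOUT = 300          # seconds, like checkpoint_timeout = 5min
CHECKPOINT_WAL_SIZE = 1024 << 20  # bytes, the proposed soft size limit (1 GiB)

def checkpoint_due(now, last_checkpoint_time, wal_since_checkpoint):
    """Return the reason a checkpoint should start now, or None."""
    if wal_since_checkpoint >= CHECKPOINT_WAL_SIZE:
        return "wal_size"       # size-based trigger fired
    if now - last_checkpoint_time >= CHECKPOINT_TIMEOUT:
        return "timeout"        # time-based trigger fired
    return None                 # keep accumulating WAL
```

Because only wal_since_checkpoint (the current cycle) enters the decision, a burst and a steady high load behave identically once the limit is reached.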
The history would matter for calculating how many segments to
preallocate/recycle, however. Under the proposal, that would be calculated
separately from checkpoint_wal_size, using some kind of moving average of
how many segments were used in previous cycles. A min setting might be
useful for that. We could also try to make WAL file creation cheaper, e.g.
by using posix_fallocate(), as was proposed in another thread, and doing it
in bgwriter or walwriter. That would make it less important to get the
estimate right from a performance point of view, although you'd still want
to get it right to avoid running out of disk space (having the segments
preallocated ensures that they are available when needed).
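The "moving average with a min floor" idea above could be sketched like this (a hypothetical illustration only; the smoothing factor, the floor, and the function name are all assumptions, not anything from the patch or the thread):

```python
# Hypothetical sketch: estimate how many WAL segments to preallocate/recycle
# using an exponentially weighted moving average of segments consumed per
# checkpoint cycle, clamped below by a "min"-style floor as Kevin suggested.

ALPHA = 0.25        # smoothing factor (assumed value)
MIN_SEGMENTS = 5    # floor on the estimate (assumed value)

def update_estimate(prev_estimate, segments_used_this_cycle):
    """Blend the latest cycle's usage into the running estimate."""
    est = (1 - ALPHA) * prev_estimate + ALPHA * segments_used_this_cycle
    return max(est, MIN_SEGMENTS)
```

The floor keeps a quiet period from shrinking the estimate to nothing, so a burst after idle time still finds some segments already preallocated.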
- Heikki
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers