On Wed, Oct 15, 2014 at 1:11 PM, Jeff Janes <jeff.ja...@gmail.com> wrote:

> On Fri, Aug 8, 2014 at 12:08 AM, Guillaume Lelarge <guilla...@lelarge.info> wrote:
>
>> Hi,
>>
>> As part of our monitoring work for our customers, we stumbled upon an
>> issue with our customers' servers who have a wal_keep_segments setting
>> higher than 0.
>>
>> We have a monitoring script that checks the number of WAL files in the
>> pg_xlog directory, according to the setting of three parameters
>> (checkpoint_completion_target, checkpoint_segments, and wal_keep_segments).
>> We usually add a percentage to the usual formula:
>>
>> greatest(
>>   (2 + checkpoint_completion_target) * checkpoint_segments + 1,
>>   checkpoint_segments + wal_keep_segments + 1
>> )
>>
>
> I think the first bug is even having this formula in the documentation to
> start with, and in trying to use it.
>
> "and will normally not be more than..."
>
> This may be "normal" for a toy system.  I think that the normal state for
> any system worth monitoring is that it has had load spikes at some point in
> the past.
>
> So it is the next part of the doc, which describes how many segments it
> climbs back down to upon recovering from a spike, which is the important
> one.  And that doesn't mention wal_keep_segments at all, which surely
> cannot be correct.
>
> I will try to independently derive the correct formula from the code, as
> you did, without looking too much at your derivation first, and see if we
> get the same answer.
>

It looked to me that the formula, when descending from a previously
stressed state, would be:

greatest((1 + checkpoint_completion_target) * checkpoint_segments, wal_keep_segments) + 1 +
2 * checkpoint_segments + 1

This assumes logs are filled evenly over a checkpoint cycle, which is
probably not true because there is a spike in full page writes right after
a checkpoint starts.

But I didn't have a great deal of confidence in my analysis.

The greatest(...) + 1 term reflects the number of WAL segments that will be
retained as-is; the 2 * checkpoint_segments + 1 term is the number that will
be recycled for future use before we start deleting them.
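
To make the two parts concrete, here is a rough Python sketch of the estimate above. It only restates my arithmetic and does not reflect what the server code actually does; the function name and the sample settings are made up for illustration.

```python
def descending_wal_estimate(checkpoint_completion_target,
                            checkpoint_segments,
                            wal_keep_segments):
    """Estimate of pg_xlog segment count when descending from a spike.

    Hypothetical helper: it mirrors the formula in this message, under the
    same assumption that WAL is filled evenly over a checkpoint cycle.
    """
    # Segments retained as-is (the greatest(...) + 1 part).
    retained = max((1 + checkpoint_completion_target) * checkpoint_segments,
                   wal_keep_segments) + 1
    # Segments kept around for recycling before deletion starts.
    recycled = 2 * checkpoint_segments + 1
    return retained, recycled, retained + recycled


# Illustrative settings only, not taken from any real server.
for keep in (0, 128):
    print(keep, descending_wal_estimate(0.5, 3, keep))
```

With a large wal_keep_segments, the retained part is dominated by that setting, and the recycling allowance is added on top of it.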

My reading of the code is that wal_keep_segments is computed from the
current end of WAL (i.e., the checkpoint record), not from the checkpoint
redo point.  If I distribute the part outside the 'greatest' into both
branches of the 'greatest', I don't get the same answer as you do for
either branch.
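
For reference, this is what the distribution step looks like when written out, again as a Python sketch with made-up function names. It compares my distributed form only against the "usual formula" quoted at the top of this message, not against your derived formula, which is not reproduced here.

```python
def usual_formula(cct, cs, wks):
    # The "usual formula" quoted at the top of this message.
    return max((2 + cct) * cs + 1, cs + wks + 1)


def descending_distributed(cct, cs, wks):
    # Distributing "+ 1 + 2 * cs + 1" into both branches of greatest():
    #   (1 + cct) * cs + 1 + 2 * cs + 1  ==  (3 + cct) * cs + 2
    #   wks + 1 + 2 * cs + 1             ==  wks + 2 * cs + 2
    return max((3 + cct) * cs + 2, wks + 2 * cs + 2)


# Illustrative settings only: cct = 0.5, cs = 3.
for wks in (0, 128):
    gap = descending_distributed(0.5, 3, wks) - usual_formula(0.5, 3, wks)
    print(wks, gap)
```

Under this arithmetic each branch differs from the corresponding branch of the usual formula by checkpoint_segments + 1, so the two results differ by that amount whichever branch wins.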

Then I started wondering if the number we keep for recycling is a good
choice, anyway.  2 * checkpoint_segments + 1 seems pretty large.  But then
again, given that we've reached the high-water-mark once, how unlikely are
we to reach it again?

Cheers,

Jeff
