On Sat, Feb 14, 2015 at 4:43 AM, Heikki Linnakangas <hlinnakan...@vmware.com > wrote:
> On 02/04/2015 11:41 PM, Josh Berkus wrote:
>> On 02/04/2015 12:06 PM, Robert Haas wrote:
>>> On Wed, Feb 4, 2015 at 1:05 PM, Josh Berkus <j...@agliodbs.com> wrote:
>>>> Let me push "max_wal_size" and "min_wal_size" again as our new
>>>> parameter names, because:
>>>>
>>>> * does what it says on the tin
>>>> * new-user friendly
>>>> * encourages people to express it in MB, not segments
>>>> * very different from the old name, so people will know it works
>>>> differently
>>>
>>> That's not bad. If we added a hard WAL limit in a future release, how
>>> would that fit into this naming scheme?
>>
>> Well, first, nobody's at present proposing a patch to add a hard limit,
>> so I'm reluctant to choose non-obvious names to avoid conflict with a
>> feature nobody may ever write. There's a number of reasons a hard limit
>> would be difficult and/or undesirable.
>>
>> If we did add one, I'd suggest calling it "wal_size_limit" or something
>> similar. However, we're most likely to only implement the limit for
>> archives, which means that it might actually be called
>> "archive_buffer_limit" or something more to the point.
>
> Ok, I don't hear any loud objections to min_wal_size and max_wal_size, so
> let's go with that then.
>
> Attached is a new version of this. It now comes in four patches. The first
> three are just GUC-related preliminary work, the first of which I posted on
> a separate thread today.

I applied all four patches to the latest master successfully and performed a test with a heavy continuous load. I see no significant difference in the checkpoint behaviour, and all seems to be working as expected.

I ran the test with the following parameter values:

max_wal_size = 10000MB
min_wal_size = 1000MB
checkpoint_timeout = 5min

During the heavy load operation, the checkpoints were occurring based on timeouts. pg_xlog size fluctuated a bit (not very much).
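In case anyone wants to repeat the comparison, I pulled the per-checkpoint distance/estimate figures out of the server log with a quick script along these lines (the regex and helper are my own, and assume the exact "distance=... kB, estimate=... kB" wording shown in the log excerpts below):

```python
import re

# Matches the tail of a "checkpoint complete" line as emitted by the
# patched server, e.g. "... distance=2748844 kB, estimate=2748844 kB".
CKPT_RE = re.compile(r"checkpoint complete:.*distance=(\d+) kB, estimate=(\d+) kB")

def checkpoint_distances(lines):
    """Return (distance_kB, estimate_kB) pairs, one per completed checkpoint."""
    out = []
    for line in lines:
        m = CKPT_RE.search(line)
        if m:
            out.append((int(m.group(1)), int(m.group(2))))
    return out

sample = [
    "2015-02-23 15:16:53.943 GMT-10 LOG: checkpoint complete: wrote 3010 "
    "buffers (18.4%); ... distance=2748844 kB, estimate=2748844 kB",
]
print(checkpoint_distances(sample))  # → [(2748844, 2748844)]
```

Nothing fancy; it just made it easier to eyeball how the distance estimate tracked the actual WAL distance across the run.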
For the first few minutes, pg_xlog size stayed at 3.3G and gradually increased to a maximum of 5.5G during the operation. The number of segments being removed+recycled fluctuated continuously. A part of the checkpoint logs follows:

2015-02-23 15:16:00.318 GMT-10 LOG: checkpoint starting: time
2015-02-23 15:16:53.943 GMT-10 LOG: checkpoint complete: wrote 3010 buffers (18.4%); 0 transaction log file(s) added, 0 removed, 159 recycled; write=27.171 s, sync=25.945 s, total=53.625 s; sync files=20, longest=5.376 s, average=1.297 s; distance=2748844 kB, estimate=2748844 kB
2015-02-23 15:21:00.438 GMT-10 LOG: checkpoint starting: time
2015-02-23 15:22:01.352 GMT-10 LOG: checkpoint complete: wrote 2812 buffers (17.2%); 0 transaction log file(s) added, 0 removed, 168 recycled; write=25.351 s, sync=35.346 s, total=60.914 s; sync files=34, longest=9.025 s, average=1.039 s; distance=1983318 kB, estimate=2672291 kB
2015-02-23 15:26:00.314 GMT-10 LOG: checkpoint starting: time
2015-02-23 15:26:25.612 GMT-10 LOG: checkpoint complete: wrote 2510 buffers (15.3%); 0 transaction log file(s) added, 0 removed, 121 recycled; write=22.623 s, sync=2.477 s, total=25.297 s; sync files=20, longest=1.418 s, average=0.123 s; distance=2537230 kB, estimate=2658785 kB
2015-02-23 15:31:00.477 GMT-10 LOG: checkpoint starting: time
2015-02-23 15:31:25.925 GMT-10 LOG: checkpoint complete: wrote 2625 buffers (16.0%); 0 transaction log file(s) added, 0 removed, 155 recycled; write=23.657 s, sync=1.592 s, total=25.447 s; sync files=13, longest=0.319 s, average=0.122 s; distance=2797386 kB, estimate=2797386 kB
2015-02-23 15:36:00.607 GMT-10 LOG: checkpoint starting: time
2015-02-23 15:36:52.686 GMT-10 LOG: checkpoint complete: wrote 3473 buffers (21.2%); 0 transaction log file(s) added, 0 removed, 171 recycled; write=31.257 s, sync=20.446 s, total=52.078 s; sync files=33, longest=4.512 s, average=0.619 s; distance=2153903 kB, estimate=2733038 kB
2015-02-23 15:41:00.675 GMT-10 LOG: checkpoint starting: time
2015-02-23 15:41:25.092 GMT-10 LOG: checkpoint complete: wrote 2456 buffers (15.0%); 0 transaction log file(s) added, 0 removed, 131 recycled; write=21.974 s, sync=2.282 s, total=24.417 s; sync files=27, longest=1.275 s, average=0.084 s; distance=2258648 kB, estimate=2685599 kB
2015-02-23 15:46:00.671 GMT-10 LOG: checkpoint starting: time
2015-02-23 15:46:26.757 GMT-10 LOG: checkpoint complete: wrote 2644 buffers (16.1%); 0 transaction log file(s) added, 0 removed, 138 recycled; write=23.619 s, sync=2.181 s, total=26.086 s; sync files=12, longest=0.709 s, average=0.181 s; distance=2787124 kB, estimate=2787124 kB
2015-02-23 15:51:00.509 GMT-10 LOG: checkpoint starting: time
2015-02-23 15:53:30.793 GMT-10 LOG: checkpoint complete: wrote 13408 buffers (81.8%); 0 transaction log file(s) added, 0 removed, 170 recycled; write=149.432 s, sync=0.664 s, total=150.284 s; sync files=13, longest=0.286 s, average=0.051 s; distance=1244483 kB, estimate=2632860 kB

The above checkpoint logs were generated while pg_xlog size was at 5.4G.

Code Review

I had a look at the code and do not have any comments from my end.

Regards,
Venkata Balaji N