Re: [HACKERS] Re: BUG #13685: Archiving while idle every archive_timeout with wal_level hot_standby

Michael Paquier Wed, 04 Nov 2015 22:01:41 -0800

On Wed, Nov 4, 2015 at 7:33 PM, Andres Freund <[email protected]> wrote:
> On 2015-11-04 16:01:28 +0900, Michael Paquier wrote:
>> On Wed, Nov 4, 2015 at 8:39 AM, Andres Freund <[email protected]> wrote:
>> > On November 4, 2015 12:37:02 AM GMT+01:00, Michael Paquier wrote:
>> >>On a completely idle system, I don't think we should log any standby
>> >>records. This is what ~9.3 does.
>> >
>> > Are you sure? I think it'll happen around checkpoints, no? I thought Heikki had
>> > fixed that, but looking around, that doesn't seem to be the case.
>>
>> Er, yes, sorry. I should have used clearer words: I meant idle system
>> with something running nothing including internal checkpoints.
>
> Uh, but you'll always have checkpoints happen on wal_level =
> hot_standby, even in 9.3?  Maybe I'm not parsing your sentence right.


Rereading my previous sentence, I cannot work out its meaning myself
:) Well, I just meant that in ~9.3 LogStandbySnapshot() is called at
each checkpoint, and checkpoints occur every checkpoint_timeout even
if the system is idle.

> As soon as a single checkpoint ever happened the early-return logic in
> CreateCheckPoint() will fail to take the LogStandbySnapshot() in
> CreateCheckPoint() into account. The test is:
>     if (curInsert == ControlFile->checkPoint +
>         MAXALIGN(SizeOfXLogRecord + sizeof(CheckPoint)) &&
>         ControlFile->checkPoint == ControlFile->checkPointCopy.redo)
> which obviously doesn't work if there's been a WAL record logged after
> the redo pointer has been determined etc.

Yes. If segment switches are forced more often than
checkpoint_timeout, this check concludes that a checkpoint needs to
happen because an XLOG_SWITCH record sits in between. I am a bit
surprised that this happens, actually: the segment switch triggers a
checkpoint record, and vice versa, even on idle systems. Shouldn't we
make this check a bit smarter then?
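
To make the failure mode concrete, here is a toy model of the test
quoted above. It is only a sketch, not the actual xlog.c code: the
struct, the function name and the record size are made up for
illustration, and only the arithmetic mirrors the quoted condition.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t XLogRecPtr;

/* Assumed size; stands in for MAXALIGN(SizeOfXLogRecord + sizeof(CheckPoint)). */
#define TOY_CHECKPOINT_REC_SIZE 112

/* Hypothetical reduction of the two pg_control fields used by the test. */
typedef struct ToyControlFile
{
    XLogRecPtr checkPoint;  /* start of the last checkpoint record */
    XLogRecPtr redo;        /* models checkPointCopy.redo */
} ToyControlFile;

/*
 * Returns true when nothing but the last checkpoint record itself has been
 * inserted since that checkpoint's redo pointer, i.e. when a new checkpoint
 * could be skipped.
 */
static bool
toy_checkpoint_is_skippable(XLogRecPtr curInsert, const ToyControlFile *cf)
{
    assert(cf != NULL);
    return curInsert == cf->checkPoint + TOY_CHECKPOINT_REC_SIZE &&
           cf->checkPoint == cf->redo;
}
```

With checkPoint = redo = 1000 and the insert position at 1112, the
function returns true. Any record in between breaks it: a standby
snapshot logged between the redo pointer and the checkpoint record
makes checkPoint differ from redo, and a segment switch after the
checkpoint moves the insert position past checkPoint + record size.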

> The reason that a single checkpoint is needed to "jumpstart" the
> pointless checkpoints is that otherwise we'll never have issued a
> LogStandbySnapshot() and thus the above code block works if we started
> from a proper shutdown checkpoint.
>
> Independent of the idle issue, it seems to me that the location of the
> LogStandbySnapshot() is actually rather suboptimal - it really should
> be before the CheckPointGuts(), not afterwards. The closer it is to
> the redo pointer of the checkpoint a hot standby node starts up from,
> the sooner that node can reach consistency.  There's no difference for
> the first time a node starts from a basebackup (since we gotta replay
> that checkpoint anyway before we're consistent), but if we start from a
> restartpoint...

Agreed. LogStandbySnapshot() has been called after CheckPointGuts()
since its introduction in efc16ea5. Moving it earlier may save time
for a standby reaching consistency. This would surely be a master-only
optimization though.
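
As a rough illustration of why the position matters, the sketch below
computes how much WAL a standby has to replay past the redo pointer
before it has read the standby snapshot record. The function name and
all numbers are hypothetical; this is arithmetic on made-up WAL
positions, not PostgreSQL code.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t XLogRecPtr;

/*
 * WAL bytes from the redo pointer up to the end of the standby snapshot
 * record. The snapshot record is assumed to start at or after redo.
 */
static uint64_t
toy_replay_until_snapshot(XLogRecPtr redo, XLogRecPtr snapshot_start,
                          uint64_t snapshot_len)
{
    assert(snapshot_start >= redo);
    return (snapshot_start - redo) + snapshot_len;
}
```

If the snapshot record is written right at the redo pointer, the
result is just the record length. If CheckPointGuts() runs first and
other backends insert WAL in the meantime, the distance grows by
however much was written during that time.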
-- 
Michael


-- 
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
