Re: [HACKERS] Measuring replay lag

Thomas Munro Wed, 04 Jan 2017 03:03:52 -0800

On Wed, Jan 4, 2017 at 8:58 PM, Simon Riggs <[email protected]> wrote:
> On 3 January 2017 at 23:22, Thomas Munro <[email protected]> wrote:
>
>>> I don't see why that would be unacceptable. If we do it for
>>> remote_apply, why not also do it for other modes? Whatever the
>>> reasoning was for remote_apply should work for other modes. I should
>>> add it was originally designed to be that way by me, so must have been
>>> changed later.
>>
>> You can achieve that with this patch by setting
>> replication_lag_sample_interval to 0.
>
> I wonder why you ignore my mention of the bug in the correct mechanism?


I didn't have an opinion on that yet, but looking now I think there is
no bug:  I was wrong about the current reply frequency.  This comment
above XLogWalRcvSendReply confused me:

 * If 'force' is not set, the message is only sent if enough time has
 * passed since last status update to reach wal_receiver_status_interval.

Actually it's sent if 'force' is set, enough time has passed, or
either of the write or flush positions has moved.  So we're already
sending replies after every write and flush, as you said we should.
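
As a rough illustration of that condition, here is a simplified sketch in
C.  It is not the actual walreceiver code: the ReplyState struct, the
should_send_reply name and the millisecond timestamps are all hypothetical,
and it leaves out details such as what happens when
wal_receiver_status_interval is disabled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical snapshot of what was reported in the last reply. */
typedef struct ReplyState
{
    uint64_t last_sent_write;   /* write position in the last reply */
    uint64_t last_sent_flush;   /* flush position in the last reply */
    int64_t  last_send_time_ms; /* time the last reply was sent */
} ReplyState;

/*
 * Returns true if a status reply should be sent now: 'force' is set,
 * the write or flush position has moved since the last reply, or the
 * status interval has elapsed.
 */
static bool
should_send_reply(const ReplyState *st, bool force,
                  uint64_t write_pos, uint64_t flush_pos,
                  int64_t now_ms, int64_t status_interval_ms)
{
    if (force)
        return true;
    if (write_pos != st->last_sent_write)
        return true;
    if (flush_pos != st->last_sent_flush)
        return true;
    return (now_ms - st->last_send_time_ms) >= status_interval_ms;
}
```

With this shape, the timeout only matters on an idle standby; any movement
of the write or flush position triggers a reply on its own.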

So perhaps I should get rid of that replication_lag_sample_interval
GUC and send back apply timestamps frequently, as you were saying.  It
would add up to a third more replies.

The effective sample rate would still drop when the fixed-size
buffers fill up and samples have to be discarded, and that would be
more likely without the GUC.  With the GUC, it doesn't start happening
until lag reaches XLOG_TIMESTAMP_BUFFER_SIZE *
replication_lag_sample_interval = ~2 hours with defaults, whereas
without rate limiting the standby might only need to fall
XLOG_TIMESTAMP_BUFFER_SIZE 'w' messages behind before we start
dropping samples.  Maybe that's perfectly OK, I'm not sure.
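
The arithmetic behind that comparison can be sketched in C as follows.
Both function names are hypothetical.  The figures used in the note after
the code (8192 buffer slots, a 1000 ms sample interval, 1000 'w' messages
per second) are assumptions chosen only because they land near the
"~2 hours" quoted above; they are not taken from the patch.  Converting a
backlog of messages into time via a steady message rate is also just an
assumption made for illustration.

```c
#include <stdint.h>

/*
 * With rate limiting: one sample is kept per interval, so the buffer
 * fills once lag reaches buffer_size * sample_interval_ms.
 */
static int64_t
lag_until_full_ms(int64_t buffer_size, int64_t sample_interval_ms)
{
    return buffer_size * sample_interval_ms;
}

/*
 * Without rate limiting: every 'w' message adds a sample, so the buffer
 * fills after buffer_size messages.  At a steady rate of msgs_per_sec
 * (must be > 0) that corresponds to this many milliseconds of lag.
 */
static int64_t
lag_until_full_unlimited_ms(int64_t buffer_size, int64_t msgs_per_sec)
{
    return buffer_size * 1000 / msgs_per_sec;
}
```

Under those assumed figures, lag_until_full_ms(8192, 1000) gives
8192000 ms (about 2.3 hours), while
lag_until_full_unlimited_ms(8192, 1000) gives 8192 ms, i.e. the buffer
could fill after only a few seconds of lag on a busy stream.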

-- 
Thomas Munro
http://www.enterprisedb.com

