On Tue, Jun 11, 2019 at 3:44 PM Tory M Blue <tmb...@gmail.com> wrote:
> On Tue, Jun 11, 2019 at 12:26 PM Christopher Browne <cbbro...@afilias.info> wrote:
>
>> On Tue, Jun 11, 2019 at 1:30 PM Tory M Blue <tmb...@gmail.com> wrote:
>>
>>> I'm wondering if there is anything I can do to prevent what appears to
>>> be Slony replication backing up (not replicating) during a pg_dump. I'm
>>> using -N slonschema, so I'm wondering if there are any other techniques
>>> to prevent Slony replication delay during pg_dumps?
>>
>> Hmm. I remember the "big deal" being to ensure that the Slony schema was
>> left out, as that would definitely lead to locking that would cause
>> replication to fall behind.
>>
>> It's possible that something new has come along that locks in a
>> troublesome way.
>>
>> I suggest that you take a look in pg_catalog.pg_locks to see what objects
>> Slony is waiting on, thus, looking for:
>>
>>   select * from pg_catalog.pg_locks where not granted;
>>
>> Some further work needs to be done to trace that to exactly which objects
>> are locked.
>
> Ya, will take a look. I also thought that bypassing the Slony schema would
> solve it, but apparently not; we can have 2-10 million rows in sl_log
> before it clears, and it's apparent that replication is not happening, so
> I'll poke around some more. Glad to see folks are still around :)

Well, growth of sl_log during the backup is fine, and not an indication of anything untoward. Since pg_dump opened a transaction when it began, and that transaction has not completed, Slony will not trim anything out of sl_log for the duration of the backup. That's normal, routine stuff. Nothing's wrong (not on the basis of that, anyway).

On the origin node, you can query the view sl_status, which will show how far behind replication appears to be. (The view is available on all nodes; it does not return terribly interesting results on nodes other than the origin.)
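As an aside on the "further work" of tracing ungranted locks back to objects: a join along these lines is one way to do it. This is a sketch, not from the thread; it covers relation-level locks only (other locktypes carry a NULL relation), and assumes a server new enough to have pg_stat_activity.query (9.2+).

```sql
-- Sketch: map ungranted lock requests to relation names and the
-- queries waiting on them.  Relation-level locks only; transactionid,
-- advisory, etc. will show a NULL relation.
select l.locktype,
       n.nspname || '.' || c.relname as relation,
       l.mode,
       a.pid,
       a.query
from pg_catalog.pg_locks l
left join pg_catalog.pg_class c on c.oid = l.relation
left join pg_catalog.pg_namespace n on n.oid = c.relnamespace
join pg_catalog.pg_stat_activity a on a.pid = l.pid
where not l.granted;
```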
Here's an example of replication being very far behind:

  usaorigin=# select * from _usa.sl_status limit 3;
  -[ RECORD 1 ]-------------+------------------------------
  st_origin                 | 1
  st_received               | 3
  st_last_event             | 5000002124
  st_last_event_ts          | 2019-04-10 20:02:43.833414+00
  st_last_received          | 5000002124
  st_last_received_ts       | 2019-04-10 20:02:43.885806+00
  st_last_received_event_ts | 2019-04-10 20:02:43.833414+00
  st_lag_num_events         | 0
  st_lag_time               | 62 days 20:23:14.029816
  -[ RECORD 2 ]-------------+------------------------------
  st_origin                 | 1
  st_received               | 4
  st_last_event             | 5000002124
  st_last_event_ts          | 2019-04-10 20:02:43.833414+00
  st_last_received          | 5000002122
  st_last_received_ts       | 2019-04-10 20:02:33.870577+00
  st_last_received_event_ts | 2019-04-10 20:02:23.76373+00
  st_lag_num_events         | 2
  st_lag_time               | 62 days 20:23:34.0995
  -[ RECORD 3 ]-------------+------------------------------
  st_origin                 | 1
  st_received               | 5
  st_last_event             | 5000002124
  st_last_event_ts          | 2019-04-10 20:02:43.833414+00
  st_last_received          | 5000002122
  st_last_received_ts       | 2019-04-10 20:02:33.898605+00
  st_last_received_event_ts | 2019-04-10 20:02:23.76373+00
  st_lag_num_events         | 2
  st_lag_time               | 62 days 20:23:34.0995

That is a replication cluster where I shut down the slon processes back in April :-)  It's only behind by a few events, for the self-same reason :-)

If the last event received coincides with the time at which you started the pg_dump run, well, that sure sounds like a smoking gun.

If, on the other hand, SYNC events are being processed, just more slowly than you'd like, then perhaps the problem is that the I/O induced by pg_dump is delaying replication, and that's more a hardware problem than a software one.

And if replication is only a few seconds behind while the system is busy, as indicated by st_lag_time being a few seconds, then it's possible that there is no problem at all.
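A narrower query against the same view makes that diagnosis easier to repeat while the dump is running. This is a sketch; "_usa" is the cluster schema from the example above, so substitute your own cluster name:

```sql
-- Sketch: watch per-subscriber lag on the origin during pg_dump.
-- Run it a few times: if st_last_received keeps advancing, SYNCs are
-- flowing and the lag is likely I/O pressure from the dump; if it is
-- frozen at the dump's start time, replication has actually stalled.
select st_received,
       st_last_received,
       st_last_received_ts,
       st_lag_num_events,
       st_lag_time
from _usa.sl_status
order by st_received;
```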
_______________________________________________
Slony1-general mailing list
Slony1-general@lists.slony.info
http://lists.slony.info/mailman/listinfo/slony1-general