On 01/28/2016 11:11 AM, Tignor, Tom wrote:
>
>       Output below. They seem to be replicating normally, except for the 
> sl_log
> growth.

Indeed. Is there anything in the slon logs for those nodes that says why 
it doesn't finish the log switch?

Connect to the database as a the slony user.

To check if a log switch is indeed in progress, do

     SELECT last_value FROM _ams_cluster.sl_log_status;

It should be either 2 or 3. If it is 0 or 1, no log switch is in 
progress and you can start one with

     SELECCT _ams_cluster.logswitch_start();

If it is 2 or 3, then you can do

     SELECT _ams_cluster.logswitch_finish();

All these operations are harmless and will only do what is safely 
possible. Look at the code of logswitch_finish() to find out how it 
determines if the current log switch can be finished. In short, the 
cleanup thread is removing events from sl_event that have been confirmed 
by all nodes in the cluster. The function logswitch_finish() looks if 
there is anything left in sl_event, that belonged to that old log. If so 
it will not finish. Running those queries manually you can find out what 
that event is that is preventing the switch to finish.



>
>
> a...@ams-repl2.ams.netmgmt:~$ /a/third-party/postgresql/bin/psql -U
> ams_slony -d ams -c 'select * from _ams_cluster.sl_status'
>   st_origin | st_received | st_last_event |       st_last_event_ts        |
> st_last_received |      st_last_received_ts      |
> st_last_received_event_ts   | st_lag_num_events |   st_lag_time
> -----------+-------------+---------------+-------------------------------+-
> -----------------+-------------------------------+-------------------------
> ------+-------------------+-----------------
>           2 |           1 |    5000611610 | 2016-01-28 16:06:37.343826+00 |
>        5000611610 | 2016-01-28 16:06:38.843562+00 | 2016-01-28
> 16:06:37.343826+00 |                 0 | 00:00:09.201996
>           2 |           3 |    5000611610 | 2016-01-28 16:06:37.343826+00 |
>        5000611609 | 2016-01-28 16:06:29.851545+00 | 2016-01-28
> 16:06:27.341894+00 |                 1 | 00:00:19.203928
>           2 |           4 |    5000611610 | 2016-01-28 16:06:37.343826+00 |
>        5000611610 | 2016-01-28 16:06:38.710974+00 | 2016-01-28
> 16:06:37.343826+00 |                 0 | 00:00:09.201996
> (3 rows)
>
>
> a...@ams-repl3.lga.netmgmt:~$  /a/third-party/postgresql/bin/psql -U
> ams_slony -d ams -c 'select * from _ams_cluster.sl_status'
>   st_origin | st_received | st_last_event |       st_last_event_ts        |
> st_last_received |      st_last_received_ts      |
> st_last_received_event_ts   | st_lag_num_events |   st_lag_time
> -----------+-------------+---------------+-------------------------------+-
> -----------------+-------------------------------+-------------------------
> ------+-------------------+-----------------
>           3 |           4 |    5000654642 | 2016-01-28 16:07:05.493455+00 |
>        5000654642 | 2016-01-28 16:07:06.486539+00 | 2016-01-28
> 16:07:05.493455+00 |                 0 | 00:00:08.522529
>           3 |           1 |    5000654642 | 2016-01-28 16:07:05.493455+00 |
>        5000654642 | 2016-01-28 16:07:08.040292+00 | 2016-01-28
> 16:07:05.493455+00 |                 0 | 00:00:08.522529
>           3 |           2 |    5000654642 | 2016-01-28 16:07:05.493455+00 |
>        5000654642 | 2016-01-28 16:07:08.472049+00 | 2016-01-28
> 16:07:05.493455+00 |                 0 | 00:00:08.522529
> (3 rows)
>
>
> a...@ams-repl4.blr.netmgmt:~$  /a/third-party/postgresql/bin/psql -U
> ams_slony -d ams -c 'select * from _ams_cluster.sl_status'
>   st_origin | st_received | st_last_event |       st_last_event_ts        |
> st_last_received |      st_last_received_ts      |
> st_last_received_event_ts   | st_lag_num_events |   st_lag_time
> -----------+-------------+---------------+-------------------------------+-
> -----------------+-------------------------------+-------------------------
> ------+-------------------+-----------------
>           4 |           3 |    5000637483 | 2016-01-28 16:07:32.698809+00 |
>        5000637482 | 2016-01-28 16:07:28.731404+00 | 2016-01-28
> 16:07:22.695826+00 |                 1 | 00:00:19.077657
>           4 |           1 |    5000637483 | 2016-01-28 16:07:32.698809+00 |
>        5000637482 | 2016-01-28 16:07:24.839978+00 | 2016-01-28
> 16:07:22.695826+00 |                 1 | 00:00:19.077657
>           4 |           2 |    5000637483 | 2016-01-28 16:07:32.698809+00 |
>        5000637482 | 2016-01-28 16:07:22.926411+00 | 2016-01-28
> 16:07:22.695826+00 |                 1 | 00:00:19.077657
> (3 rows)
>
>
>
>       Tom    :-)
>
>
>
> On 1/28/16, 10:38 AM, "Jan Wieck" <j...@wi3ck.info> wrote:
>
>>On 01/28/2016 08:30 AM, Tignor, Tom wrote:
>>>
>>> Hello slony folks,
>>>  From my reading I¹m guessing (hoping) this isn¹t a new problem. I have
>>> a simple cluster with one provider replicating to three subscribers. The
>>> provider¹s changelog tables (sl_log_[1|2]) are fine, but the subscribers
>>> (with forwarding enabled) are all showing runaway growth. Looked through
>>> the FAQ and I don¹t see the node I dropped or any idle transactions as
>>> viable culprits. Are there other thoughts on the cause? Can I safely
>>> manually delete/truncate some/all of the changelog tables? These
>>> replicas are all leaf nodes. I only have forwarding turned on to allow
>>> for failover, and my replication rate is the 2 sec default.
>>> Thanks in advance for any insights.
>>
>>What is the output of the sl_status view "on those leaf nodes"?
>>
>>
>>>
>>> ams=# select
>>> pg_size_pretty(pg_total_relation_size('_ams_cluster.sl_log_1'));
>>>
>>>   pg_size_pretty
>>>
>>> ----------------
>>>
>>>   75 MB
>>>
>>> (1 row)
>>>
>>>
>>> ams=# select
>>> pg_size_pretty(pg_total_relation_size('_ams_cluster.sl_log_2'));
>>>
>>>   pg_size_pretty
>>>
>>> ----------------
>>>
>>>   34 GB
>>>
>>> (1 row)
>>>
>>>
>>> ams=# select * from _ams_cluster.sl_confirm where con_origin not in
>>> (select no_id from _ams_cluster.sl_node) or con_received not in (select
>>> no_id from _ams_cluster.sl_node);
>>>
>>>   con_origin | con_received | con_seqno | con_timestamp
>>>
>>> ------------+--------------+-----------+---------------
>>>
>>> (0 rows)
>>>
>>>
>>> ams=# select * from pg_stat_activity where current_query like '%IDLE%';
>>>
>>>   datid | datname | procpid | usesysid |  usename   |
>>> application_name      |  client_addr   | client_hostname | client_port |
>>>          backend_start         |          xact_start           |
>>>    query_start          | waiting |
>>>
>>>                          current_query
>>>
>>>
>>>-------+---------+---------+----------+------------+---------------------
>>>------+----------------+-----------------+-------------+-----------------
>>>--------------+-------------------------------+--------------------------
>>>-----+---------+---
>>>
>>> ----------------------------------------------------------------
>>>
>>>   16393 | ams     |    2611 |   212995 | ams_viewer |
>>>          | 88.221.209.10  |                 |       43328 | 2016-01-28
>>> 12:24:49.706389+00 |                               | 2016-01-28
>>> 13:18:02.427848+00 | f       | <I
>>>
>>> DLE>
>>>
>>>   16393 | ams     |   12894 |   212995 | ams_viewer |
>>>          | 88.221.209.10  |                 |       60112 | 2016-01-28
>>> 12:47:26.230681+00 |                               | 2016-01-28
>>> 13:15:27.744242+00 | f       | <I
>>>
>>> DLE>
>>>
>>>   16393 | ams     |   12884 |   212995 | ams_viewer |
>>>          | 88.221.209.10  |                 |       44302 | 2016-01-28
>>> 12:47:25.100006+00 |                               | 2016-01-28
>>> 13:15:27.936059+00 | f       | <I
>>>
>>> DLE>
>>>
>>>   16393 | ams     |   23466 |   213867 | ams_slony  | psql
>>>          |                |                 |          -1 | 2016-01-28
>>> 13:11:32.030343+00 | 2016-01-28 13:18:37.283992+00 | 2016-01-28
>>> 13:18:37.283992+00 | f       | se
>>>
>>> lect * from pg_stat_activity where current_query like '%IDLE%';
>>>
>>>   16393 | ams     |    6719 |   213867 | ams_slony  |
>>> slon.origin_2_provider_2  | 60.254.150.133 |                 |
>>> 61806 | 2016-01-22 01:59:14.800129+00 |                               |
>>> 2016-01-28 13:18:25.935111+00 | f       | <I
>>>
>>> DLE>
>>>
>>>   16393 | ams     |    6718 |   213867 | ams_slony  |
>>> slon.origin_3_provider_2  | 60.254.150.133 |                 |
>>> 61805 | 2016-01-22 01:59:14.797655+00 |                               |
>>> 2016-01-28 13:18:34.304475+00 | f       | <I
>>>
>>> DLE>
>>>
>>>   16393 | ams     |    5505 |   213867 | ams_slony  |
>>> slon.origin_4_provider_2  | 80.67.75.105   |                 |
>>> 36477 | 2016-01-22 01:56:25.637046+00 |                               |
>>> 2016-01-28 13:18:36.1348+00   | f       | <I
>>>
>>> DLE>
>>>
>>>   16393 | ams     |    5504 |   213867 | ams_slony  |
>>> slon.origin_3_provider_2  | 72.246.50.22   |                 |
>>> 51813 | 2016-01-22 01:56:25.240798+00 |                               |
>>> 2016-01-28 13:18:28.961629+00 | f       | <I
>>>
>>> DLE>
>>>
>>>   16393 | ams     |    5487 |   213867 | ams_slony  |
>>> slon.origin_4_provider_2  | 72.246.50.22   |                 |
>>> 51803 | 2016-01-22 01:56:22.896388+00 |                               |
>>> 2016-01-28 13:18:35.858913+00 | f       | <I
>>>
>>> DLE>
>>>
>>>   16393 | ams     |    5047 |   213867 | ams_slony  |
>>> slon.origin_2_provider_2  | 72.246.50.22   |                 |
>>> 51564 | 2016-01-22 01:55:23.600296+00 |                               |
>>> 2016-01-28 13:18:34.487192+00 | f       | <I
>>>
>>> DLE>
>>>
>>>   16393 | ams     |    5041 |   213867 | ams_slony  |
>>> slon.origin_2_provider_2  | 80.67.75.105   |                 |
>>> 36402 | 2016-01-22 01:55:22.964462+00 |                               |
>>> 2016-01-28 13:18:34.519066+00 | f       | <I
>>>
>>> DLE>
>>>
>>>   16393 | ams     |    6694 |   213867 | ams_slony  |
>>> slon.node_2_listen        | 60.254.150.133 |                 |
>>> 61795 | 2016-01-22 01:59:12.095052+00 |                               |
>>> 2016-01-28 13:18:27.928384+00 | f       | <I
>>>
>>> DLE>
>>>
>>>   16393 | ams     |    4456 |   213867 | ams_slony  |
>>> slon.node_2_listen        | 72.246.50.22   |                 |
>>> 51238 | 2016-01-22 01:54:21.481355+00 |                               |
>>> 2016-01-28 13:18:36.766973+00 | f       | <I
>>>
>>> DLE>
>>>
>>>   16393 | ams     |    4457 |   213867 | ams_slony  |
>>> slon.node_2_listen        | 80.67.75.105   |                 |
>>> 36333 | 2016-01-22 01:54:21.500456+00 |                               |
>>> 2016-01-28 13:18:36.204482+00 | f       | <I
>>>
>>> DLE>
>>>
>>>   16393 | ams     |    4428 |   213867 | ams_slony  |
>>> slon.local_monitor        |                |                 |
>>> -1 | 2016-01-22 01:54:18.977015+00 |                               |
>>> 2016-01-28 13:18:36.652567+00 | f       | <I
>>>
>>> DLE>
>>>
>>>   16393 | ams     |    4427 |   213867 | ams_slony  | slon.local_sync
>>>          |                |                 |          -1 | 2016-01-22
>>> 01:54:18.976932+00 |                               | 2016-01-28
>>> 13:18:36.151998+00 | f       | <I
>>>
>>> DLE>
>>>
>>>   16393 | ams     |    4426 |   213867 | ams_slony  |
>>> slon.local_cleanup        |                |                 |
>>> -1 | 2016-01-22 01:54:18.976842+00 |                               |
>>> 2016-01-28 13:12:12.582921+00 | f       | <I
>>>
>>> DLE>
>>>
>>>   16393 | ams     |    4425 |   213867 | ams_slony  |
>>> slon.remoteWorkerThread_4 |                |                 |
>>> -1 | 2016-01-22 01:54:18.976783+00 |                               |
>>> 2016-01-28 13:18:33.99715+00  | f       | <I
>>>
>>> DLE>
>>>
>>>   16393 | ams     |    4420 |   213867 | ams_slony  |
>>> slon.remoteWorkerThread_1 |                |                 |
>>> -1 | 2016-01-22 01:54:18.976548+00 |                               |
>>> 2016-01-28 13:18:33.561531+00 | f       | <I
>>>
>>> DLE>
>>>
>>>   16393 | ams     |    4419 |   213867 | ams_slony  |
>>> slon.remoteWorkerThread_3 |                |                 |
>>> -1 | 2016-01-22 01:54:18.97647+00  |                               |
>>> 2016-01-28 13:18:34.808907+00 | f       | <I
>>>
>>> DLE>
>>>
>>>   16393 | ams     |    4413 |   213867 | ams_slony  | slon.local_listen
>>>          |                |                 |          -1 | 2016-01-22
>>> 01:54:18.965568+00 |                               | 2016-01-28
>>> 13:18:37.096159+00 | f       | <I
>>>
>>> DLE>
>>>
>>> (21 rows)
>>>
>>>
>>> ams=#
>>>
>>>
>>>
>>> Tom    :-)
>>>
>>>
>>>
>>>
>>> _______________________________________________
>>> Slony1-general mailing list
>>> Slony1-general@lists.slony.info
>>> http://lists.slony.info/mailman/listinfo/slony1-general
>>>
>>
>>
>>--
>>Jan Wieck
>>Senior Software Engineer
>>http://slony.info
>


-- 
Jan Wieck
Senior Software Engineer
http://slony.info
_______________________________________________
Slony1-general mailing list
Slony1-general@lists.slony.info
http://lists.slony.info/mailman/listinfo/slony1-general

Reply via email to