Hello Greg, thank you for your reply.
[...]
If the tables are that busy, you may want to set autokick=0 and rely
one some other means to start the sync, such as setting it to run
every 10 seconds. This will reduce the overhead of having a trigger
on each table.

1) How scalable is bucardo : In other words, is there a sync limit
in bucardo, that could make a solution with lot of syncs not
scalable? For example, are syncs independent or not / dependent of a
process that controls all syncs, ie, if one process is blocked, does
it have impacts on others?
They are *mostly* independent. As long as there are no tables shared among
the syncs, things should run fairly smoothly. There is no sync limit
per se, but eventually you may see problems with the sheer number of
notices and perhaps KID processes.

OK.
My bad, the script I wrote to add new events was wrong:
When adding tables, I forgot to include the schema. If things were working correctly with only one event, it was worse and worse when adding new events because instead of adding 5 tables in one relgroup, 5 *(number of events) tables were added, leading to very bad things. So correcting my script to only add the tables I really liked to add (...) makes things better, no more scaling problem with multiple syncs.

Last question, may be related or not : I notice that some sync
sometimes become inactive. After that, I find no way to make then
work again, using bucardo activate XXX does not solve, and stopping
/ restarting the daemon does not help, and nothing special is
present in the logs to explain what is wrong. So why a sync can
become inactive, why and what to do in this case?
If a database involved in the sync is inactive, the sync will go
inactive as well. The logs should be giving some clue as to what
is going on. Try bumping the log_level up (perhaps to DEBUG) and
see if that gives you a better message.

To reproduce this, I "simply" kill postgres (killall -9 postgres) on the same host bucardo daemon is running on. (I am sure there are other ways to reproduce this, but at least, this method works) Then I restart postgres, restart bucardo, and check that fresh data is added in all tables.

After that, some of the syncs are still active, the other are stalled, and I find no way to
activate them again.
I notice that the CTL and KID processes are not started for those stalled sync:
<<<
# ps -ef|grep Bucard
bucardo 28054 1 3 14:49 ? 00:00:00 Bucardo Master Control Program v5.3.1. Active syncs: event_sync_inf_01,event_sync_inf_02,event_sync_inf_03,event_sync_inf_04,event_sync_inf_05,event_sync_inf_06,event_sync_inf_07,event_sync_inf_08,event_sync_inf_09,event_sync_inf_10,config_sync
bucardo  28184 28054  0 14:49 ?        00:00:00 Bucardo VAC.
bucardo 28197 28054 0 14:49 ? 00:00:00 Bucardo Controller. Sync "event_sync_inf_01" for relgroup "event_relgroup_inf_01" to dbs "event_dbgroup_inf_01" bucardo 28198 28054 0 14:49 ? 00:00:00 Bucardo Controller. Sync "event_sync_inf_08" for relgroup "event_relgroup_inf_08" to dbs "event_dbgroup_inf_08" bucardo 28200 28197 2 14:49 ? 00:00:00 Bucardo Kid. Sync "event_sync_inf_01" bucardo 28201 28054 0 14:49 ? 00:00:00 Bucardo Controller. Sync "event_sync_inf_10" for relgroup "event_relgroup_inf_10" to dbs "event_dbgroup_inf_10" bucardo 28204 28198 2 14:49 ? 00:00:00 Bucardo Kid. Sync "event_sync_inf_08" bucardo 28206 28054 0 14:49 ? 00:00:00 Bucardo Controller. Sync "config_sync" for relgroup "config_relgroup" to dbs "config_dbgroup" bucardo 28208 28201 2 14:49 ? 00:00:00 Bucardo Kid. Sync "event_sync_inf_10" bucardo 28211 28054 0 14:49 ? 00:00:00 Bucardo Controller. Sync "event_sync_inf_09" for relgroup "event_relgroup_inf_09" to dbs "event_dbgroup_inf_09" bucardo 28229 28206 2 14:49 ? 00:00:00 Bucardo Kid. Sync "config_sync" bucardo 28232 28211 2 14:49 ? 00:00:00 Bucardo Kid. Sync "event_sync_inf_09"

>>>

<<<
# bucardo status
PID of Bucardo MCP: 28054
Name State Last good Time Last I/D Last bad Time
=====================+========+============+=========+===========+===========+=========
event_sync_inf_01 | Good | 13:51:21 | 2s | 0/0 | 13:12:49 | 38m 34s event_sync_inf_02 | Bad | 13:12:46 | 38m 36s | 5/14 | 13:12:49 | 38m 34s event_sync_inf_03 | Bad | 13:12:47 | 38m 36s | 5/14 | 13:12:49 | 38m 34s event_sync_inf_04 | Bad | 13:12:46 | 38m 36s | 5/14 | 13:12:49 | 38m 34s event_sync_inf_05 | Bad | 13:12:46 | 38m 36s | 5/14 | 13:12:49 | 38m 34s event_sync_inf_06 | Bad | 13:12:47 | 38m 36s | 5/14 | 13:12:49 | 38m 34s event_sync_inf_07 | Bad | 13:12:46 | 38m 36s | 5/14 | 13:12:49 | 38m 34s event_sync_inf_08 | Good | 13:51:21 | 1s | 0/0 | 13:12:49 | 38m 34s event_sync_inf_09 | Good | 13:51:21 | 2s | 0/0 | 13:12:49 | 38m 34s event_sync_inf_10 | Good | 13:51:21 | 1s | 5/14 | 13:12:49 | 38m 34s config_sync | Good | 13:51:22 | 0s | 0/0 | 13:12:48 | 38m 34s
>>>

<<<
# bucardo status event_sync_inf_02
======================================================================
Last good                : Feb 27, 2015 13:12:46 (time to run: 1s)
Rows deleted/inserted    : 5 / 14
Last bad                 : Feb 27, 2015 13:12:49 (time until fail: 1s)
Sync name                : event_sync_inf_02
Current state            : Bad
Source relgroup/database : event_relgroup_inf_02 / event_db_bucardo_0
Tables in sync           : 8
Status                   : Stalled
Check time               : None
Overdue time             : 00:00:00
Expired time             : 00:00:00
Stayalive/Kidsalive      : Yes / Yes
Rebuild index            : No
Autokick                 : Yes
Onetimecopy              : No
Post-copy analyze        : Yes
Last error: : Failed : DBD::Pg::db do failed: at /usr/share/perl5/Bucardo.pm line 3401. Line: 5453 Main DB state: ? Error: none DB event_db_bucardo_0 state: ? Error: none DB event_db_bucardo_1 state: 22000 Error: 7 (KID 16002)
======================================================================

>>>

Here some logs (only for one stalled sync the event_sync_inf_02)
<<<
2015-02-27 14:49:36 [ll101] Bucardo info [28054]: MCP Checking stalled sync event_sync_inf_02 #local1,devel 2015-02-27 14:49:36 [ll101] Bucardo info [28054]: MCP LISTEN for "bucardo_activate_sync_event_sync_inf_02" on "bucardo" (line 7727) #local1,devel 2015-02-27 14:49:36 [ll101] Bucardo info [28054]: MCP LISTEN for "bucardo_reload_sync_event_sync_inf_02" on "bucardo" (line 7727) #local1,devel 2015-02-27 14:49:37 [ll101] Bucardo info [28054]: MCP Activating sync "event_sync_inf_02" #local1,devel 2015-02-27 14:49:37 [ll101] Bucardo info [28054]: MCP Running validate_sync on "event_sync_inf_02" #local1,devel 2015-02-27 14:49:37 [ll101] Bucardo info [28054]: MCP Inspecting source sequence "event_inf_02.table4_id_seq" on database "event_db_bucardo_0" #local1,devel 2015-02-27 14:49:37 [ll101] Bucardo info [28054]: MCP Set conflict strategy for event_inf_02.table4_id_seq to "bucardo_latest" #local1,devel 2015-02-27 14:49:37 [ll101] Bucardo info [28054]: MCP Inspecting source sequence "event_inf_02.streams_id_seq" on database "event_db_bucardo_0" #local1,devel 2015-02-27 14:49:37 [ll101] Bucardo info [28054]: MCP Set conflict strategy for event_inf_02.streams_id_seq to "bucardo_latest" #local1,devel 2015-02-27 14:49:37 [ll101] Bucardo info [28054]: MCP Inspecting source sequence "event_inf_02.table1_id_seq" on database "event_db_bucardo_0" #local1,devel 2015-02-27 14:49:37 [ll101] Bucardo info [28054]: MCP Set conflict strategy for event_inf_02.table1_id_seq to "bucardo_latest" #local1,devel 2015-02-27 14:49:37 [ll101] Bucardo info [28054]: MCP Inspecting source table "event_inf_02.table1" on database "event_db_bucardo_0" #local1,devel 2015-02-27 14:49:37 [ll101] Bucardo info [28054]: MCP Set conflict strategy for event_inf_02.table1 to "bucardo_latest" #local1,devel 2015-02-27 14:49:37 [ll101] Bucardo info [28054]: MCP Inspecting source table "event_inf_02.table4" on database "event_db_bucardo_0" #local1,devel 2015-02-27 14:49:37 [ll101] Bucardo info [28054]: MCP Set conflict strategy for event_inf_02.table4 to "bucardo_latest" #local1,devel 2015-02-27 14:49:37 [ll101] Bucardo info [28054]: MCP Inspecting source table "event_inf_02.streams" on database "event_db_bucardo_0" #local1,devel 2015-02-27 14:49:37 [ll101] Bucardo info [28054]: MCP Set conflict strategy for event_inf_02.streams to "bucardo_latest" #local1,devel 2015-02-27 14:49:37 [ll101] Bucardo info [28054]: MCP Inspecting source table "event_inf_02.table3" on database "event_db_bucardo_0" #local1,devel 2015-02-27 14:49:37 [ll101] Bucardo info [28054]: MCP Set conflict strategy for event_inf_02.table3 to "bucardo_latest" #local1,devel 2015-02-27 14:49:37 [ll101] Bucardo info [28054]: MCP Inspecting source table "event_inf_02.table2" on database "event_db_bucardo_0" #local1,devel 2015-02-27 14:49:37 [ll101] Bucardo info [28054]: MCP Set conflict strategy for event_inf_02.table2 to "bucardo_latest" #local1,devel 2015-02-27 14:49:37 [ll101] Bucardo info [28054]: MCP Listen for kick_sync_event_sync_inf_02 on event_db_bucardo_0 (source) #local1,devel 2015-02-27 14:49:37 [ll101] Bucardo info [28054]: MCP Listen for kick_sync_event_sync_inf_02 on event_db_bucardo_1 (source) #local1,devel 2015-02-27 14:49:37 [ll101] Bucardo info [28054]: MCP Sending NOTIFY "activated_sync_event_sync_inf_02" (line 7218) skip_commit=0 #local1,devel 2015-02-27 14:49:37 [ll101] Bucardo info [28054]: MCP UNLISTEN for "bucardo_activate_sync_event_sync_inf_02" on "bucardo" (line 7220) #local1,devel 2015-02-27 14:49:37 [ll101] Bucardo info [28054]: MCP LISTEN for "bucardo_deactivate_sync_event_sync_inf_02" on "bucardo" (line 7222) #local1,devel 2015-02-27 14:49:37 [ll101] Bucardo info [28054]: MCP LISTEN for "bucardo_kick_sync_event_sync_inf_02" on "bucardo" (line 7223) #local1,devel 2015-02-27 14:49:37 [ll101] Bucardo info [28054]: MCP LISTEN for "bucardo_pause_sync_event_sync_inf_02" on "bucardo" (line 7224) #local1,devel 2015-02-27 14:49:37 [ll101] Bucardo info [28054]: MCP LISTEN for "bucardo_resume_sync_event_sync_inf_02" on "bucardo" (line 7225) #local1,devel 2015-02-27 14:49:39 [ll101] Bucardo info [28054]: MCP Got NOTICE kick_sync_event_sync_inf_02 5 times from 18258 (line 980) #local1,devel 2015-02-27 14:49:40 [ll101] Bucardo info [28054]: MCP Got NOTICE kick_sync_event_sync_inf_02 from 7130 (line 980) #local1,devel 2015-02-27 14:49:40 [ll101] Bucardo info [28054]: MCP Got NOTICE kick_sync_event_sync_inf_02 4 times from 7131 (and 2 other PIDs) (line 980) #local1,devel 2015-02-27 14:49:42 [ll101] Bucardo info [28054]: MCP Got NOTICE kick_sync_event_sync_inf_02 3 times from 3668 (and 2 other PIDs) (line 980) #local1,devel 2015-02-27 14:49:42 [ll101] Bucardo info [28054]: MCP Got NOTICE kick_sync_event_sync_inf_02 2 times from 7128 (line 980) #local1,devel 2015-02-27 14:49:44 [ll101] Bucardo info [28054]: MCP Got NOTICE kick_sync_event_sync_inf_02 3 times from 7130 (line 980) #local1,devel 2015-02-27 14:49:44 [ll101] Bucardo info [28054]: MCP Got NOTICE kick_sync_event_sync_inf_02 2 times from 7128 (line 980) #local1,devel 2015-02-27 14:49:46 [ll101] Bucardo info [28054]: MCP Got NOTICE kick_sync_event_sync_inf_02 from 7128 (line 980) #local1,devel 2015-02-27 14:49:46 [ll101] Bucardo info [28054]: MCP Got NOTICE kick_sync_event_sync_inf_02 4 times from 7130 (and 2 other PIDs) (line 980) #local1,devel

>>>

Is there something I can do to make things working without removing/adding the sync corresponding to stalled events in bucardo ?

Thanks and regards,

Sylvain

_______________________________________________
Bucardo-general mailing list
[email protected]
https://mail.endcrypt.com/mailman/listinfo/bucardo-general

Reply via email to