Related to the issues I emailed about yesterday: I deleted the network route between the two postgres servers, ran a bunch of updates on the remote server, then restored the connection. Bucardo on the 'local' server is now doing this repeatedly, every minute or so:

NOTICE: Rows deleted from delta_public_group_stats_main_history_56: 0
Rows deleted from track_public_group_stats_main_history_56: 0
(1585) [Fri Sep 20 11:13:08 2013] KID Delta count for local_trumgr_group.public.int_unit_moves : 100
[unnecessary lines elided]
(1585) [Fri Sep 20 11:13:09 2013] KID Conflicts for public.unit_stats_32: 1
DBI::db=HASH(0x99c54d0)->disconnect invalidates 9 active statement handles (either destroy statement handles or call finish on them before disconnecting) at /usr/local/share/perl/5.10.1/Bucardo.pm line 2250.
DBI::db=HASH(0xa162f10)->disconnect invalidates 9 active statement handles (either destroy statement handles or call finish on them before disconnecting) at /usr/local/share/perl/5.10.1/Bucardo.pm line 2250.
(1585) [Fri Sep 20 11:13:09 2013] KID Kid 1585 exiting at cleanup_kid. Sync "trumgr_group_sync" public.unit_stats_32 Reason: Invalid conflict_strategy 'latest' used for public.unit_stats_32 at /usr/local/share/perl/5.10.1/Bucardo.pm line 3586. Line: 4766
=======================================================================================================
That 'Invalid conflict_strategy' error is bothersome. I checked the bucardo config, and it listed "bucardo_latest" as the default strategy, but I had manually set the strategy to 'latest' for each sync during setup.
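
For anyone who wants to check their own log for the same failure, here is a rough Python sketch that pulls the sync name, table, and rejected strategy out of the 'Invalid conflict_strategy' lines. The regex is modeled only on the single log line quoted above, so other Bucardo versions may word the message differently; the function name find_invalid_strategies is made up for illustration and is not part of Bucardo.

```python
import re

# Pattern modeled on the one log line quoted in this message (assumption:
# other Bucardo versions may format the message differently).
INVALID_STRATEGY = re.compile(
    r'Sync "(?P<sync>[^"]+)"\s+(?P<table>\S+)\s+'
    r"Reason: Invalid conflict_strategy '(?P<strategy>[^']+)'"
)


def find_invalid_strategies(log_text):
    """Return a sorted list of unique (sync, table, strategy) tuples."""
    return sorted({
        (m.group("sync"), m.group("table"), m.group("strategy"))
        for m in INVALID_STRATEGY.finditer(log_text)
    })


if __name__ == "__main__":
    # Sample copied from the KID line in the log excerpt above.
    sample = (
        '(1585) [Fri Sep 20 11:13:09 2013] KID Kid 1585 exiting at cleanup_kid. '
        'Sync "trumgr_group_sync" public.unit_stats_32 Reason: Invalid '
        "conflict_strategy 'latest' used for public.unit_stats_32 at "
        '/usr/local/share/perl/5.10.1/Bucardo.pm line 3586. Line: 4766'
    )
    for sync, table, strategy in find_invalid_strategies(sample):
        print(f"{sync}: {table} rejected strategy {strategy!r}")
```

Because the error repeats every minute or so, the results are de-duplicated, which leaves one entry per affected sync and table.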

I then ran 'bucardo stop', and got this at shutdown:

(23854) [Fri Sep 20 11:15:11 2013] CTL Found stopfile "/var/run/bucardo/fullstopbucardo": exiting
(23854) [Fri Sep 20 11:15:11 2013] CTL Warning! Controller for "trumgr_group_sync" was killed at line 1747: Found stopfile: stop |
(23854) [Fri Sep 20 11:15:12 2013] CTL Controller 23854 exiting at cleanup_controller. Reason: Found stopfile: stop |
(23734) [Fri Sep 20 11:15:13 2013] MCP End of cleanup_mcp. Sys time: Fri Sep 20 11:15:13 2013. Database time: 2013-09-20 11:15:13.058532-07
(23734) [Fri Sep 20 11:15:13 2013] MCP Exiting
*** glibc detected *** Bucardo VAC.: double free or corruption (!prev): 0x0a311398 ***
======= Backtrace: =========
/lib/i686/cmov/libc.so.6(+0x6b381)[0xb75ff381]
/lib/i686/cmov/libc.so.6(+0x6cbd8)[0xb7600bd8]
/lib/i686/cmov/libc.so.6(cfree+0x6d)[0xb7603cbd]
/usr/lib/libpq.so.5(PQclear+0xf6)[0xb733c5a6]
/usr/local/lib/perl/5.10.1/auto/DBD/Pg/Pg.so(pg_st_destroy+0x17b)[0xb736f40b]
/usr/local/lib/perl/5.10.1/auto/DBD/Pg/Pg.so(+0xe869)[0xb7363869]
/usr/local/lib/perl/5.10.1/auto/DBI/DBI.so(XS_DBI_dispatch+0xffb)[0xb73e114b]
Bucardo VAC.(Perl_pp_entersub+0x52b)[0x80d5ddb]
Bucardo VAC.(Perl_call_sv+0x5a8)[0x8078c18]
Bucardo VAC.(Perl_sv_clear+0xa0)[0x80e80f0]
Bucardo VAC.(Perl_sv_free2+0x4a)[0x80e883a]
Bucardo VAC.[0x80dd939]
Bucardo VAC.(Perl_sv_clean_objs+0x29)[0x80dd999]
Bucardo VAC.(perl_destruct+0x11bf)[0x807dd0f]
Bucardo VAC.(main+0xdb)[0x80642eb]
/lib/i686/cmov/libc.so.6(__libc_start_main+0xe6)[0xb75aaca6]
Bucardo VAC.[0x8064171]

[memory dump portion removed]

Then, 15 minutes after restarting bucardo, the same repeating error appeared. I ran 'bucardo reload config' and the errors stopped, but I'll have to check with the db folks as to whether there was any actual data error.

To reiterate for thoroughness:
'local' master, which runs the bucardo controller:
Debian Squeeze 32bit
postgresql 8.4
Bucardo 5.0.0 latest from repo
perl 5.10.1 ('i486')
DBD::Pg    2.19.3
DBIx::Safe    1.2.5

'remote' master:
Debian Squeeze 64bit
postgresql 8.4
Bucardo 5.0.0 latest from repo
perl 5.10.1 (x64)
DBD::Pg    2.19.3
DBIx::Safe    1.2.5

I'm fairly at my wit's end. I can't figure out why I have so many issues with this.

--
Paul Theodoropoulos
www.anastrophe.com

_______________________________________________
Bucardo-general mailing list
[email protected]
https://mail.endcrypt.com/mailman/listinfo/bucardo-general
