[Bucardo-general] Hung problem..

Michelle Sullivan Tue, 15 Apr 2014 04:17:36 -0700

My Bucardo instance keeps hanging. When it does, the process list looks
like this:


 5487 ?        S      0:00 Bucardo Kid. Sync "rt4db"
 5488 ?        Ss     0:01 postgres: bucardo bucardo [local] notify interrupt
16485 ?        Ss     0:00 postgres: pgsql bucardo [local] idle
19100 pts/0    S+     0:00 grep -i buc
20179 ?        S      0:02 Bucardo Kid. Sync "dnsmm"
20180 ?        Ss     0:03 postgres: bucardo bucardo [local] idle
25607 ?        S      0:10 Bucardo Kid. Sync "rt4seq"
25609 ?        Ss     0:04 postgres: bucardo bucardo [local] notify interrupt
25614 ?        S      0:11 Bucardo Kid. Sync "sorbsmmseq"
25615 ?        Ss     0:11 postgres: bucardo bucardo [local] notify interrupt
25617 ?        S      0:01 Bucardo Kid. Sync "rt3db"
25618 ?        Ss     0:09 postgres: bucardo bucardo [local] notify interrupt
25621 ?        S      0:12 Bucardo Kid. Sync "rt3seq"
25622 ?        Ss     0:06 postgres: bucardo bucardo [local] notify interrupt
25629 ?        S      0:06 Bucardo Kid. Sync "rt4db"
25630 ?        Ss     0:18 postgres: bucardo bucardo [local] notify interrupt
30978 ?        S      0:01 Bucardo Kid. Sync "sessions"
30979 ?        Ss     0:09 postgres: bucardo bucardo [local] notify interrupt


The only way to resolve it is a kill -9 on both the DB process and the
Bucardo Kid. Any clues as to what is wrong?

Killing only the DB processes (or restarting the DB with -m immediate)
leaves the Kids hanging indefinitely.
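
For anyone who wants to spot this state from a script rather than by
eye, here is a minimal Python sketch that parses a process listing like
the one above. It is only an illustration: the function names
(parse_ps, stuck_backends, kid_pids) are made up for this example and
are not part of Bucardo, and it assumes the listing has the same five
columns as my output (PID, TTY, STAT, TIME, COMMAND). A line with no
leading PID is treated as the wrapped tail of the previous command.

```python
import re

# One line of a ps listing with columns: PID TTY STAT TIME COMMAND.
PS_LINE = re.compile(r"^\s*(\d+)\s+(\S+)\s+(\S+)\s+(\d+:\d+)\s+(.*)$")

# Hypothetical pattern for the Kid process title seen in the listing.
KID_TITLE = re.compile(r'Bucardo Kid\. Sync "([^"]+)"')

# Shortened excerpt of the listing, including the mail-wrapped lines.
SAMPLE = """\
 5487 ?        S      0:00 Bucardo Kid. Sync "rt4db"
 5488 ?        Ss     0:01 postgres: bucardo bucardo [local] notify
interrupt
20179 ?        S      0:02 Bucardo Kid. Sync "dnsmm"
20180 ?        Ss     0:03 postgres: bucardo bucardo [local] idle
25629 ?        S      0:06 Bucardo Kid. Sync "rt4db"
25630 ?        Ss     0:18 postgres: bucardo bucardo [local] notify
interrupt
"""


def parse_ps(text):
    """Return a list of (pid, command) tuples, re-joining wrapped lines."""
    rows = []
    for line in text.splitlines():
        match = PS_LINE.match(line)
        if match:
            rows.append([int(match.group(1)), match.group(5).strip()])
        elif rows and line.strip():
            # No PID on this line: append it to the previous command.
            rows[-1][1] += " " + line.strip()
    return [(pid, cmd) for pid, cmd in rows]


def stuck_backends(rows):
    """PIDs of postgres backends whose title ends in 'notify interrupt'."""
    return [
        pid
        for pid, cmd in rows
        if cmd.startswith("postgres:") and cmd.endswith("notify interrupt")
    ]


def kid_pids(rows):
    """Map each sync name to the PIDs of its Bucardo Kid processes."""
    kids = {}
    for pid, cmd in rows:
        match = KID_TITLE.match(cmd)
        if match:
            kids.setdefault(match.group(1), []).append(pid)
    return kids


if __name__ == "__main__":
    rows = parse_ps(SAMPLE)
    print("stuck backends:", stuck_backends(rows))
    print("kids per sync:", kid_pids(rows))
```

Run against the excerpt, this flags backends 5488 and 25630, and it
also shows two Kids for the "rt4db" sync (5487 and 25629), which matches
what is in the full listing.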

My suspicion is that the initial cause is the interconnect between the
DBs silently going away and coming back (it's a VPN between data
centers), but how do I stop Bucardo from failing to recover? What's
causing the lockup inside Bucardo?

DB is Pg 8.4.10 on all multi-master nodes on CentOS, bucardo version
4.99.10.

Regards,

Michelle

_______________________________________________
Bucardo-general mailing list
[email protected]
https://mail.endcrypt.com/mailman/listinfo/bucardo-general
