> > 1. How is my data? Do I need to re-sync?
>
> Possible. Check your data :)

Only a few hundred million rows... I better get started :-)

> > 2. How can I prove that this problem is related to threading issue?
>
> I don't think it is related to threading issue.
>
> If you have had more than 2G (_xxx_cluster_.sl_log_1.log_xid > 2G)
> transactions executed during the replication, without reindexing
> sl_log_1, then indexes on xxid starts misbehaving, resulting both in
> duplicate key errors *and* some events not being replicated (i.e. data
> loss).

This could be it. The problem has occurred three times, each time after
adding a new table that took some time to COPY and index. There were no
pending events when the COPY started, though, and replication caught up
quickly once the addition/merge was complete.

I very much doubt we've reached 2G transactions yet, as this is a new
cluster with only master->slave replication. I would estimate ~40
million transactions.

> If you want to know a little more about the issue look for my recent
> posts on this list.

I will read this and continue to investigate.

Thanks,

Michael
_______________________________________________
Slony1-general mailing list
[email protected]
http://gborg.postgresql.org/mailman/listinfo/slony1-general
