On 25 May 2011 12:43, Ger Timmens <[email protected]> wrote:
> Alternatively, this might occur because the slon for this node
> has been broken for a long time, and there are an enormous number of
> entries in sl_event on this or other nodes for the node to work
> through, and it is taking more than slon_conf_remote_listen_timeout
> seconds to run the query. In older versions of Slony-I, that
> configuration parameter did not exist; the timeout was fixed at 300
> seconds. In newer versions, you might increase that timeout in the
> slon config file to a larger value so that it can continue to
> completion. And then investigate why nobody was monitoring things
> such that replication broke for such a long time...

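For reference, a rough sketch of what the relevant entry in the slon
config file might look like. This assumes the runtime option is spelled
remote_listen_timeout (the quoted text uses the internal name
slon_conf_remote_listen_timeout), and the value 600 is only an
illustration, not a recommendation; check the runtime configuration
docs for your Slony-I version before relying on either:

    # slon config file fragment (sketch; option name and value are assumptions)
    # Seconds the remote listener waits for its event query to return.
    remote_listen_timeout=600

The slon for that node would need to be restarted to pick up the change.
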
If this is the case, then you can raise the listen timeout in the slon
config file to something in the hundreds of seconds.

> Replication seems to continue fine after this error.
> Is it safe to continue ?
> Or should we start from scratch ?
> If so what do we have to do to prevent this error from happening again ?

In general, Slony will not allow slaves to enter an inconsistent state,
so continuing should be safe. Look at the "test_slony_state" Perl
script, which checks various parts of the configuration and verifies
that things are running correctly:

http://slony.info/documentation/2.0/monitoring.html

This should form part of your monitoring setup. It is common to run the
script automatically at regular intervals.

-- 
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services
_______________________________________________
Slony1-general mailing list
[email protected]
http://lists.slony.info/mailman/listinfo/slony1-general
