I have a single master, ten slave system. The replication lag has been
steadily increasing over time. Let me clarify that - the replication is
still happening but the average slave which started out 0-10 seconds
behind in lag, is now about 30 - 2mins behind. When the lag reaches
about two minutes it is reset to 30 seconds, not back to zero.. The
postgres database is going very slowly, whether slony causes that or is
affected by it I'm not sure. The slony replication lag is a very good
indication of how fast the system is running. There are very few changes
on the master node (maybe 10-20 per minute), so it is not sheer volume
of load slowing everything down, it is each postgres query. Also the
master machine has about 40% cpu idle so is not maxxed out doing other
jobs.

sl_event, sl_confirm and sl_log_1 are all reasonably sized at less than
2000 entries each. We have a vacuum routine which does vacuum analyze on
all databases once an hour. Keeping an eye on the replication status I
have found that a vacuum makes no difference, neither does a vacuum
analyze. However, a vacuum full makes everything suddenly super fast.
All the lags go back to 0-10 seconds. We don't normally run a vacuum
full because I thought it only recovered disk space, and the locking has
caused us issues on the past. At first I thought this one time vacuum
full on a system that never runs one was the magic cure, but three hours
later and the lags are all increasing again. Another vacuum full sets
everything right again.

Has anyone else found that they need a vacuum full to keep their
replication, and/or general postgres use up to speed? Have I
mis-understood the use of vacuum full, and does it do more than just
recover disk space? Next time it goes wrong I will do vacuum full on a
table at a time to see if I can narrow down the culprit. 

Any help appreciated.

Vicki


This message should be regarded as confidential. If you have received this 
email in error please notify the sender and destroy it immediately.
Statements of intent shall only become binding when confirmed in hard copy 
by an authorized signatory.

_______________________________________________
Slony1-general mailing list
[email protected]
http://gborg.postgresql.org/mailman/listinfo/slony1-general

Reply via email to