On 6-Feb-07, at 1:37 PM, Dan Falconer wrote:
...
>       I'm not trying to blame-shift here, but it really seems like the lag is
> generated from Slony itself.  There's a stale connection, initiated by Slony
> (that's the only thing that connects on the private 192.168.1.x network),
> which sits idle in transaction.  On the master, the query shows as "fetch 100
> from LOG;" and shows on the slave as "<IDLE> in transaction".
Just yesterday, our replication was behind by 18 million rows in sl_log_1 and
8 million rows in sl_log_2. This is on Slony-I 1.2.6 and PostgreSQL 8.2.1. A
slony postgres process was using 100% CPU (more specifically, one core of one
processor) on the master db. Doing a netstat -a on the master db showed a
non-slony postgres process in a CLOSE_WAIT state corresponding to a query in
"<IDLE> in transaction". Once I got rid of the hung connection, slony caught
up within 40 minutes. I suppose this might be similar to what occurs when
there is a long-running query.
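
For what it's worth, something like the following is roughly how such a backend
can be spotted from pg_stat_activity. This is only a sketch: the connection
string and the 30-minute cutoff are made-up placeholders, and the column names
assume PostgreSQL 8.2.

    #!/usr/bin/env python
    # Rough sketch: list backends sitting "<IDLE> in transaction" for a while.
    # Column names (procpid, current_query, query_start) are as of
    # PostgreSQL 8.2; the DSN and the 30-minute threshold are placeholders.
    import psycopg2

    conn = psycopg2.connect("dbname=mydb user=postgres host=localhost")
    cur = conn.cursor()
    cur.execute("""
        SELECT procpid, usename, client_addr, query_start
          FROM pg_stat_activity
         WHERE current_query = '<IDLE> in transaction'
           AND now() - query_start > interval '30 minutes'
    """)
    for procpid, usename, client_addr, query_start in cur.fetchall():
        # query_start is when the backend's last statement began, which is
        # only a rough proxy for how long it has been idle in transaction.
        # 8.2 has no pg_terminate_backend(), and pg_cancel_backend() does
        # nothing to an idle backend, so the hung process has to be killed
        # by PID from the shell on the database server.
        print("idle in transaction: pid=%s user=%s addr=%s last query at %s"
              % (procpid, usename, client_addr, query_start))
    cur.close()
    conn.close()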

I'm not sure if this is pertinent to your problem, but I thought I'd pass
along my experience in case it helps, since the problems seem similar.

I am going to add a watchdog to check for network connections stuck in a
CLOSE_WAIT state to hopefully catch this in the future.
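
Something along these lines is a possible starting point for the watchdog. It
is only a sketch, assuming Linux-style netstat -an output and the default port
5432, so the parsing would need adjusting on other platforms or setups.

    #!/usr/bin/env python
    # Watchdog sketch: report PostgreSQL sockets stuck in CLOSE_WAIT.
    # Assumes Linux-style `netstat -an` output and the default port 5432.
    import os

    PG_PORT = ":5432"

    def close_wait_connections():
        stuck = []
        for line in os.popen("netstat -an"):
            fields = line.split()
            # Typical layout: Proto Recv-Q Send-Q Local-Addr Foreign-Addr State
            if len(fields) >= 6 and fields[5] == "CLOSE_WAIT" \
                    and PG_PORT in fields[3]:
                stuck.append((fields[3], fields[4]))
        return stuck

    if __name__ == "__main__":
        for local, remote in close_wait_connections():
            # A real watchdog might mail an admin or cross-check
            # pg_stat_activity for a matching "<IDLE> in transaction"
            # backend before acting on the connection.
            print("CLOSE_WAIT on %s from %s" % (local, remote))

Run from cron every few minutes, it could mail an alert instead of printing.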

Brian Wipf
<[EMAIL PROTECTED]>

_______________________________________________
Slony1-general mailing list
[email protected]
http://gborg.postgresql.org/mailman/listinfo/slony1-general
