Re: [Slony1-general] slon process "stops" after 58 minutes

thorkill Fri, 14 Mar 2014 07:32:34 -0700

Hi all,

Vick Khera <[email protected]> writes:


> So in prep for upgrading from 2.1 to 2.2 this weekend, I upgraded my
> server's OS from FreeBSD 9.1 to 9.2 (a fairly minor update, as OS updates
> go).
>
> Since the upgrade, the slon connected to the replica DB on that upgraded
> server will stop after just about 58 to 59 minutes. Restarting the slon
> daemon allows the replication to continue and fairly quickly catch up.

I have similar problems. It is 45 minutes +/- remote_listen_timeout for me.

[...]

In my case it's a "master" on FreeBSD 9.2 and a "slave" on FreeBSD 9.1. It
seems that somewhere around net.inet.tcp.keepidle (check with sysctl -a),
when the OS keepalive kicks in, the connection gets dropped, depending
on your keepalive configuration. You should also see error messages in
the PostgreSQL logs like:

getsockopt(TCP_KEEPCNT) failed: Protocol not available
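
To compare the interval you observe with the OS keepalive timers, something
along these lines may help. It is only a sketch: it assumes the FreeBSD OID
names and that keepidle/keepintvl are reported in milliseconds there, and
ms_to_minutes and show_keepalive are made-up helper names.

```shell
#!/bin/sh
# Convert a keepalive timer given in milliseconds (the unit FreeBSD uses
# for net.inet.tcp.keepidle and net.inet.tcp.keepintvl) to whole minutes.
ms_to_minutes() {
    echo $(( $1 / 60000 ))
}

# Print the keepalive-related sysctls if they exist on this host.
# always_keepalive is a 0/1 switch; the other two are timers.
# On systems without these OIDs the loop prints nothing.
show_keepalive() {
    for oid in net.inet.tcp.always_keepalive \
               net.inet.tcp.keepidle \
               net.inet.tcp.keepintvl; do
        val=$(sysctl -n "$oid" 2>/dev/null) || continue
        echo "$oid = $val"
    done
}

show_keepalive
```

For example, ms_to_minutes 7200000 prints 120, which can then be compared
with the interval after which slon stops.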

> Any ideas? This is so confusing because it is such an odd time interval
> before it locks up. What's magical about 58 minutes?

My wild guess is:

http://lists.freebsd.org/pipermail/freebsd-stable/2013-November/075781.html


The problem I have is that slon on the slave reconnects to the database
on the master, but the old backend process on the master stays around
forever, taking up resources. After the first connection I get about
1 hour "in production" where everything is just fine; then keepalive +
remote_listen_timeout kicks in and drops the connection, and slon
reconnects every remote_listen_timeout, creating a new backend process
each time.
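
One way to see whether backends are piling up on the master is to list the
connections coming from the slave's address. The following is a rough
sketch, not a tested recipe: backend_query is a hypothetical helper, the
address is a placeholder, and the column names assume the pg_stat_activity
layout of PostgreSQL 9.2 and later.

```shell
#!/bin/sh
# Print a query listing the backends opened from one client address,
# oldest first. Argument: the client (slave) IP address.
backend_query() {
    cat <<EOF
SELECT pid, usename, backend_start, state_change, state
  FROM pg_stat_activity
 WHERE client_addr = '$1'
 ORDER BY backend_start;
EOF
}
```

The output can be piped into psql on the master, for example:

    backend_query 192.0.2.10 | psql -U postgres yourdb

where both the address and the database name are placeholders. Several rows
with old backend_start values would match the leftover backends described
above.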

I have one simple workaround for this: crontab + slon restart every
hour.
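
A minimal sketch of such a workaround might look like the wrapper below.
The script path, the crontab line in the comment, and the default restart
command are all assumptions; replace them with whatever actually restarts
slon on your system.

```shell
#!/bin/sh
# Hourly restart wrapper for slon, meant to be called from cron, e.g.
#   0 * * * * /usr/local/sbin/restart_slon.sh run     (hypothetical path)
#
# SLON_RESTART_CMD is an assumption: override the default with the
# command that restarts slon on your installation.
: "${SLON_RESTART_CMD:=service slon restart}"

restart_slon() {
    # Intentionally unquoted so the command string splits into words.
    $SLON_RESTART_CMD
}

# Only act when explicitly asked to, so the file can also be sourced.
if [ "${1:-}" = "run" ]; then
    restart_slon
fi
```

Requiring the "run" argument keeps the script from restarting anything when
it is merely sourced for testing.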

Cheers,
thorkill

PS. PostgreSQL 9.3.2 is used on both nodes.


_______________________________________________
Slony1-general mailing list
[email protected]
http://lists.slony.info/mailman/listinfo/slony1-general
