> > Try:
> >
> > echo 300 > /proc/sys/net/ipv4/tcp_keepalive_time
> > echo 60 > /proc/sys/net/ipv4/tcp_keepalive_intvl
> > echo 10 > /proc/sys/net/ipv4/tcp_keepalive_probes
> >
> > This will send the first keepalive after 5 minutes, and then every 60
> > seconds after that, and will drop the connection if no response is seen
> > from 10 consecutive probes.
> >
> > The default is that the keepalive won't start for 2 hours (7200 seconds)
> > after the connection has been idle. Not much good in your case.
>
> Well, life is not all that simple. You haven't mentioned *all* the details
> of what happens when you lower the default keepalive from 2 hours to 300
> seconds, which could be fatal to many programs. I haven't looked at the
> details of this for the last 7 years, but if I recall correctly, once the
> keepalive time has expired, the OS will attempt to contact the other end,
> which, if keepalive has not been enabled by the application, will cause the
> line to terminate.

Do you have a source for this? I'm pretty sure that you aren't recalling
correctly. I think that those keepalive settings only come into effect if
SO_KEEPALIVE has been set by the application.

> keepalive was not designed to refresh routers but to ensure that inactive
> dead connections in an OS are eventually detected and closed.

Correct, but I have successfully used it at a client site when a Cisco
router was dropping connections prematurely. The original scenario was:

PC --- router --- internet --- router --- Server

Users local to the server were fine, but users on the remote PCs would go
away from their desks, come back, and as soon as they hit a key they would
get the "Connection Closed" message. Both the client and server software
were closed source, with no option to enable keepalives, so what I did was
this:

PC --- router --- internet --- router --- Server
                                       |
                                       |- Linux Server

(if the ASCII art comes out all wrong, it should look like the Linux Server
is connected to the LAN segment between the Server and the Router)

The Linux server ran 'simpleproxy' (I think), which was literally just a
program that accepted a TCP connection from the PC, created a connection to
the Server, and forwarded packets between them. It didn't support
SO_KEEPALIVE initially, but that was pretty easy to add, and once added,
all the problems went away!

> Routers should keep
> all lines open for a minimum of 3 hours of idle time (IMO).

I wonder if there is an RFC for this... I know I'm being pedantic here, but
routers do not (should not) track connections; it is firewalls that track
connections, time them out, and then treat subsequent packets on those
connections as 'wtf is this packet?' and send reset/unreachable responses.

> > Does Bacula include any application level keepalives?
>
> Bacula sets keepalive on all its sockets when they are opened.
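
For reference, here is a rough Python sketch of what "setting keepalive on a
socket" amounts to, as described above for simpleproxy and Bacula. It is not
taken from either program. The helper name enable_keepalive is made up for
illustration, and the per-socket timer overrides (TCP_KEEPIDLE, TCP_KEEPINTVL,
TCP_KEEPCNT) are assumed to be available as they are on Linux; where they are
missing, the system-wide values apply.

```python
import socket


def enable_keepalive(sock, idle=300, interval=60, probes=10):
    """Hypothetical helper: turn on TCP keepalive for a single socket.

    The defaults mirror the 300/60/10 values discussed in this thread.
    """
    # This is the switch the application has to flip itself.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

    # Optional per-socket overrides of the /proc values. These constants
    # are platform-specific, so skip any that this platform lacks.
    for name, value in (("TCP_KEEPIDLE", idle),
                        ("TCP_KEEPINTVL", interval),
                        ("TCP_KEEPCNT", probes)):
        opt = getattr(socket, name, None)
        if opt is not None:
            sock.setsockopt(socket.IPPROTO_TCP, opt, value)
    return sock
```

With per-socket overrides like these, a proxy can use short timers on its own
connections without touching the system-wide settings in /proc.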
>
> > There should be no
> > need to do this if you've set /proc and setsockopt correctly, unless the
> > respective daemons implement their own application level timeouts.
>
> Yes, providing you don't mind prematurely killing off non-keepalive
> programs that are inactive during the reduced keepalive period you have
> set.

This should be relatively easy to test... assuming we can't find a document
somewhere that clarifies it one way or another.

James
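
A possible starting point for such a test, as a Python sketch. The function
names are made up for illustration. It only checks two things: that a freshly
created socket has SO_KEEPALIVE switched off, and roughly how long an idle
connection to a dead peer survives under a given set of values. The Linux
tcp(7) man page says that keepalives are sent only when SO_KEEPALIVE is
enabled; a full test would still leave an idle connection open past
tcp_keepalive_time and watch the wire with a packet sniffer.

```python
import socket
from pathlib import Path

PROC_IPV4 = Path("/proc/sys/net/ipv4")


def system_keepalive_settings():
    """Return (time, intvl, probes) read from /proc, or None if unavailable."""
    names = ("tcp_keepalive_time", "tcp_keepalive_intvl",
             "tcp_keepalive_probes")
    try:
        return tuple(int((PROC_IPV4 / n).read_text()) for n in names)
    except (OSError, ValueError):
        return None  # not Linux, or /proc is not readable


def seconds_until_drop(time, intvl, probes):
    """Approximate idle seconds before a dead peer is given up on:
    the wait before the first probe plus one interval per probe."""
    return time + intvl * probes


def keepalive_enabled(sock):
    """True if the application (or a library) has set SO_KEEPALIVE."""
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
```

With the values suggested earlier in the thread, seconds_until_drop(300, 60, 10)
gives 900, i.e. about 15 minutes before a dead connection is dropped, and
keepalive_enabled() on a new socket.socket() should report False until the
program sets the option itself.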