On 07/05/2011 02:52 PM, jan.gnep...@t-systems.com wrote:
Defining all three servers within one section in modules/ldap:

ldap { server = "<IP ldap-1>   <IP ldap-2>   <IP ldap-3>" .}

And setting just "ldap" within authorize and authenticate:

With this config another ldap server is chosen if the one
that has handled the communication for ldap group extends
fails. But failover took 15 minutes. That's much too long for
us. (1-3 minutes at most will be acceptable, "zero outage"
gorgeous/expected)

It should not take 15 minutes.

What is your "net_timeout" set to?

net_timeout = 1
timelimit = 2
timeout = 4

For testing I added a host route to another gateway (= host
unreachable).

OK, I tested around with a single ldap section. Setting a route to a
different interface for testing was a bad idea! I watched the
connections on the ldap port while running my tests:

- I made the first request (with a positive answer).
- A connection to one server was opened and stays "established"!
- I added the route for that server via another gateway.
- The established connection is still visible
  (netstat -anlp | grep <ldap-server-port>).
- All requests for the next 15 minutes fail (server not reachable).
- After 15 minutes, the established connection terminates, and a new
  connection to another server is opened. Radius has switched to
  another server, and everything went fine from then on.

I don't understand where that 15 minutes is coming from; unless there's a bug in libldap, the net_timeout should end up being a timeout on the select() call, and should take at most that long to fail.
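One possible source of a figure close to 15 minutes, offered here as an assumption and not something confirmed in this thread: on Linux, data sent on an already established connection is retransmitted with exponential backoff until the tcp_retries2 limit is reached, and the kernel documentation gives a hypothetical timeout of about 924.6 seconds for the default value of 15. The sketch below only reproduces that arithmetic. The 200 ms minimum and 120 s maximum retransmission timeouts are the usual Linux constants, and the function name is made up for illustration.

```python
def hypothetical_tcp_timeout(retries=15, rto_min=0.2, rto_max=120.0):
    """Sum the waits for the first transmission plus `retries` retransmissions.

    Assumes the retransmission timeout (RTO) starts at rto_min, doubles
    after every unanswered (re)transmission, and is capped at rto_max.
    """
    total = 0.0
    rto = rto_min
    for _ in range(retries + 1):
        total += rto
        rto = min(rto * 2, rto_max)
    return total


seconds = hypothetical_tcp_timeout()
print(f"{seconds:.1f} s = {seconds / 60:.1f} min")  # 924.6 s = 15.4 min
```

If this is what happened, the 15 minutes would be the kernel giving up on the old connection, not a timeout configured in the ldap module.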


But I made the same test again, with "tcpkill" from the dsniff
package, instead of setting a route. And with these tests radius
switches immediately to another server, no request fails! :-)

Now it is just unclear: will these tests be representative of real
ldap-server or connection problems?

Not necessarily. You may or may not receive a TCP RST or ICMP error for an existing connection if your LDAP server goes down; you certainly can't rely on it.
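The difference between the two tests can be illustrated with a small sketch. A peer whose connection is killed (as with tcpkill) produces an explicit reset or close that the client sees at once, while a peer that merely becomes unreachable produces nothing at all, so only a local timeout can detect it. probe_peer is a hypothetical helper, not part of FreeRADIUS or libldap, and a local socket pair stands in for the LDAP connection.

```python
import select
import socket


def probe_peer(sock, timeout):
    """Classify what the peer did within `timeout` seconds."""
    readable, _, _ = select.select([sock], [], [], timeout)
    if not readable:
        return "silent"      # nothing arrived: unreachable or just idle
    try:
        data = sock.recv(4096)
    except ConnectionResetError:
        return "reset"       # explicit TCP RST, e.g. what tcpkill provokes
    return "data" if data else "closed"


client, server = socket.socketpair()
print(probe_peer(client, 0.2))   # peer says nothing
server.close()
print(probe_peer(client, 0.2))   # peer closed the connection
client.close()
```

Only the "silent" case depends on the timeout value; the other cases are reported as soon as the kernel learns about them.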

Basically, no: not always.

libldap needs to use and honour the timeout value, and I'm not sure why it isn't doing it. I haven't had time to test it yet, but will try later this week.
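As an illustration of what honouring the timeout would mean in practice, here is a minimal sketch of a read bounded by select() with failover to the next server. It is not libldap or rlm_ldap code: the function names, the server labels and the use of local socket pairs in place of LDAP connections are all assumptions made for the example.

```python
import select
import socket


def recv_with_timeout(sock, timeout):
    """Wait at most `timeout` seconds for a reply, then give up."""
    readable, _, _ = select.select([sock], [], [], timeout)
    if not readable:
        raise TimeoutError(f"no reply within {timeout} s")
    return sock.recv(4096)


def query_with_failover(connections, request, timeout):
    """Try each (name, socket) in order; skip servers that time out or fail."""
    for name, sock in connections:
        try:
            sock.sendall(request)
            reply = recv_with_timeout(sock, timeout)
            if reply:
                return name, reply
        except OSError:
            continue  # TimeoutError is a subclass of OSError
    raise RuntimeError("no server answered")


dead_client, dead_peer = socket.socketpair()   # this peer never answers
live_client, live_peer = socket.socketpair()
live_peer.sendall(b"ok")                       # reply is already waiting

print(query_with_failover(
    [("ldap-1", dead_client), ("ldap-2", live_client)], b"search", timeout=0.5))
```

With this structure a silent server costs at most one timeout interval per request before the next server is tried.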