Hi Frank -

What version of Radiator are you running currently?

Hugh

> On 5 Nov 2020, at 04:10, Frank Danielson <[email protected]> wrote:
>
> Good Day All-
>
> We’ve been running AuthByLOADBALANCE for some time now and have noticed that
> if a message does not get a response from the downstream hosts, it will be
> retried infinitely. This not only keeps the message around forever, but each
> failed attempt also increases the failure counts for the target hosts, which
> makes them more likely to be marked unavailable and causes delivery problems
> with other requests.
>
> For example, a malformed request may be sent by an upstream client and
> handled by AuthByLOADBALANCE, where the target hosts simply do not respond to
> the proxied request because they don’t like it. The request will be retried
> on the current host Retries times by handle_timeout(), after which the
> request is handed off to failed(), which tracks MaxFailedRequests for the
> host and marks it unavailable if applicable, and then hands off the request
> to forward(), which calls chooseHost() to find the next available host. The
> stock chooseHost() in AuthByRADIUS tracks whether the request has reached the
> end of the list or not, but chooseHost() in AuthByLOADBALANCE will always
> return a host if one is available, and it could even be the same host as the
> last try if MaxFailedRequests has not been reached for that host. The end
> result is that the request will be retried forever, incrementing the failure
> count for downstream hosts and causing them to be marked unavailable.
>
> After some looking at the code I think I could override failed() to track the
> number of unique hosts to which a request has been forwarded with something
> like
>
>   $fp->{retryHosts}->{$host}++;
>
> and then add a couple of checks in chooseHost() that are similar to the
> original one-
>
>   if (scalar(keys %{$fp->{retryHosts}}) < scalar(@{$self->{Hosts}}))
>   {
>       foreach $host (@{$self->{Hosts}})
>       {
>           next if ($fp->{retryHosts}->{$host});
>           …
>
> The end result being that the request will be tried on each host in the list
> Retries times, moving to the next best candidate chosen by the volume
> algorithm, until all hosts have been tried and then the request fails. That
> may not be the optimal behavior but it beats trying forever.
>
> Before doing that and bearing the burden of maintaining a custom AuthBy I
> figured I’d send it to the list and see if someone else has already solved
> this problem or if Open Systems would be willing to revisit the
> AuthByLOADBALANCE logic. Perhaps changing the interpretation of Retries to
> mean the total number of times a request is retried, instead of a per-host
> number, in order to have a finite lifetime on a request? In that case
> chooseHost() could be called for each retry in handle_timeout() to increase
> the chances of success.
>
> Regards-
>
> Frank Danielson | S.V.P. Engineering
> * [email protected]
>
> _______________________________________________
> radiator mailing list
> [email protected]
> https://lists.open.com.au/mailman/listinfo/radiator

--

Hugh Irvine
[email protected]

Radiator: the most portable, flexible and configurable RADIUS server
anywhere. SQL, proxy, DBM, files, LDAP, NIS+, password, NT, Emerald,
Platypus, Freeside, TACACS+, PAM, external, Active Directory, EAP, TLS,
TTLS, PEAP, TNC, WiMAX, RSA, Vasco, Yubikey, MOTP, HOTP, TOTP,
DIAMETER, SIM, etc.
Full source on Unix, Linux, Windows, macOS, Solaris, VMS, NetWare etc.
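
The per-request host tracking Frank describes can be illustrated with a small standalone model. This is a sketch in Python, not Radiator's Perl code, and every name in it (Host, Request, choose_host, forward, responds) is hypothetical. It assumes one initial send plus `retries` retransmissions per host, a simplified failure counter that marks a host unavailable once it reaches `max_failed`, and a "least failures" chooser standing in for the real volume algorithm. The point it shows is only that remembering which hosts a request has already tried gives the request a finite lifetime.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    name: str
    failed: int = 0  # requests that timed out on this host (simplified counter)


@dataclass
class Request:
    # Names of hosts this request has already been forwarded to
    # (plays the role of $fp->{retryHosts} in the proposal).
    tried: set = field(default_factory=set)


def choose_host(hosts, request, max_failed):
    """Return the best available host this request has not tried, or None."""
    candidates = [
        h for h in hosts
        if h.failed < max_failed and h.name not in request.tried
    ]
    if not candidates:
        return None  # every usable host has been tried: give up
    # Stand-in for the load-balancing choice: prefer the fewest failures.
    return min(candidates, key=lambda h: h.failed)


def forward(hosts, request, retries, max_failed, responds):
    """Try hosts until one answers or all have been tried.

    `responds(host)` is a caller-supplied function saying whether the host
    replies. Returns (host name or None, total number of sends).
    """
    attempts = 0
    while True:
        host = choose_host(hosts, request, max_failed)
        if host is None:
            return None, attempts
        request.tried.add(host.name)
        for _ in range(1 + retries):  # initial send plus retransmissions
            attempts += 1
            if responds(host):
                return host.name, attempts
        host.failed += 1  # charged once per request, not once per loop forever


hosts = [Host("a"), Host("b"), Host("c")]
print(forward(hosts, Request(), retries=2, max_failed=10,
              responds=lambda h: False))
# (None, 9): three hosts, three sends each, then the request is dropped
```

Without the `request.tried` check, choose_host() in this model would keep returning a host for as long as any host stays under `max_failed`, which is the unbounded behavior described in the original message. With it, a request that nobody answers costs each host at most one failure.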
