Hi Frank - I know Heikki has been looking at related issues, so I’ll let him follow up with you.
cheers

Hugh

> On 6 Nov 2020, at 02:43, Frank Danielson <[email protected]> wrote:
>
> Hi Hugh-
>
> I’m running an older version, 4.7, but I did look at the code for
> AuthByLOADBALANCE and it does not seem to have changed in the latest
> version. If there have been other changes in the retry behavior in
> AuthByRADIUS we’ll schedule the update and see what happens. From a very
> cursory look at the code, the underlying logic seems to be the same:
> AuthByRADIUS depends on chooseHost() returning no available host, and as
> long as it supplies one the request will keep retrying. The one exception
> is that if all target hosts have been marked as down then the
> AuthByLOADBALANCE chooseHost() logs
> "ProxyAlgorithm LOADBALANCE Could not find a working host to proxy to"
> and the request stops retrying.
>
> Regards-
>
> Frank Danielson | S.V.P. Engineering
> * [email protected]
>
>> On Nov 4, 2020, at 4:46 PM, Hugh Irvine <[email protected]> wrote:
>>
>> Hi Frank -
>>
>> What version of Radiator are you running currently?
>>
>> Hugh
>>
>>> On 5 Nov 2020, at 04:10, Frank Danielson <[email protected]> wrote:
>>>
>>> Good Day All-
>>>
>>> We’ve been running AuthByLOADBALANCE for some time now and have noticed
>>> that a message that does not get a response from the downstream hosts
>>> will be retried indefinitely. This not only keeps the message around
>>> forever but, as it is tried and fails, also increases the failure counts
>>> for the target hosts, which makes them more likely to be marked
>>> unavailable and causes delivery problems for other requests.
>>>
>>> For example, a malformed request may be sent by an upstream client and
>>> handled by AuthByLOADBALANCE, where the target hosts simply do not respond
>>> to the proxied request because they don’t like it.
>>> The request will be retried on the current host Retries times by
>>> handle_timeout(), after which it is handed off to failed(), which tracks
>>> MaxFailedRequests for the host, marks the host unavailable if applicable,
>>> and then hands the request to forward(), which calls chooseHost() to find
>>> the next available host. The stock chooseHost() in AuthByRADIUS tracks
>>> whether the request has reached the end of the list, but chooseHost() in
>>> AuthByLOADBALANCE will always return a host if one is available, and it
>>> could even be the same host as the last try if MaxFailedRequests has not
>>> been reached for that host. The end result is that the request is retried
>>> forever, incrementing the failure count for downstream hosts and causing
>>> them to be marked unavailable.
>>>
>>> After some looking at the code I think I could override failed() to track
>>> the unique hosts to which a request has been forwarded with something like
>>>
>>>     $fp->{retryHosts}->{$host}++
>>>
>>> and then add a couple of checks in chooseHost() that are similar to the
>>> original one-
>>>
>>>     if (keys %{$fp->{retryHosts}} < @{$self->{Hosts}})
>>>     {
>>>         foreach $host (@{$self->{Hosts}})
>>>         {
>>>             next if ($fp->{retryHosts}->{$host});
>>>             …
>>>
>>> The end result would be that the request is tried Retries times on each
>>> host in turn, with the next-best candidate chosen by the volume algorithm
>>> each time, until all hosts have been tried and the request fails. That may
>>> not be the optimal behavior but it beats trying forever.
>>>
>>> Before doing that and bearing the burden of maintaining a custom AuthBy I
>>> figured I’d send it to the list and see if someone else has already solved
>>> this problem or if Open Systems would be willing to revisit the
>>> AuthByLOADBALANCE logic.
>>> Perhaps the interpretation of Retries could change to mean the total
>>> number of times a request is retried, instead of a per-host number, so
>>> that a request has a finite lifetime? In that case chooseHost() could be
>>> called for each retry in handle_timeout() to increase the chances of
>>> success.
>>>
>>> Regards-
>>>
>>> Frank Danielson | S.V.P. Engineering
>>> * [email protected]
>>>
>>> _______________________________________________
>>> radiator mailing list
>>> [email protected]
>>> https://lists.open.com.au/mailman/listinfo/radiator
>>
>> --
>>
>> Hugh Irvine
>> [email protected]
>>
>> Radiator: the most portable, flexible and configurable RADIUS server
>> anywhere. SQL, proxy, DBM, files, LDAP, NIS+, password, NT, Emerald,
>> Platypus, Freeside, TACACS+, PAM, external, Active Directory, EAP, TLS,
>> TTLS, PEAP, TNC, WiMAX, RSA, Vasco, Yubikey, MOTP, HOTP, TOTP,
>> DIAMETER, SIM, etc.
>> Full source on Unix, Linux, Windows, macOS, Solaris, VMS, NetWare etc.

-- 
Hugh Irvine
[email protected]
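
The per-request host tracking that Frank proposes can be modelled outside
Radiator. What follows is a minimal, self-contained Perl sketch and not
Radiator code: choose_host(), proxy_request() and the host fields name, down
and failed are hypothetical names made up for illustration, and the picker
simply takes the first eligible host instead of applying the real
load-balancing algorithm. It assumes that no host ever replies, and it shows
that once each host is recorded in a per-request hash (the role played by
$fp->{retryHosts} above) the number of transmissions is bounded by the number
of hosts times (1 + Retries).

```perl
use strict;
use warnings;

# Hypothetical stand-in for a host picker: return the first host that is
# not marked down and has not yet been tried for this request, or undef
# when every candidate is exhausted.
sub choose_host {
    my ($hosts, $tried) = @_;
    foreach my $host (@$hosts) {
        next if $host->{down};
        next if $tried->{ $host->{name} };
        return $host;
    }
    return;
}

# Simulate one request against hosts that never reply. Each chosen host
# receives the initial send plus $retries retransmissions, then counts one
# failed request, and the request moves on to the next untried host.
sub proxy_request {
    my ($hosts, $retries) = @_;
    my %tried;        # per-request record of hosts already used
    my $sends = 0;    # total transmissions for this request
    while (defined(my $host = choose_host($hosts, \%tried))) {
        $tried{ $host->{name} }++;
        $sends += 1 + $retries;
        $host->{failed}++;
    }
    return $sends;
}

# Three silent hosts with Retries = 3: 3 x (1 + 3) = 12 transmissions,
# after which the request gives up.
my @hosts = map { +{ name => $_, down => 0, failed => 0 } } qw(a b c);
my $total = proxy_request(\@hosts, 3);
print "transmissions before giving up: $total\n";
```

Without the %tried check, choose_host() would keep returning the first host
that is not down, which is the unbounded behaviour described in the thread.
With it, each host also accumulates only one failed request per silent
message, so a single bad request is less likely to push a host over
MaxFailedRequests.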
