Hi Frank -

I know Heikki has been looking at related issues, so I’ll let him follow up 
with you.

cheers

Hugh


> On 6 Nov 2020, at 02:43, Frank Danielson <[email protected]> wrote:
> 
> Hi Hugh-
> 
> I’m running an older version, 4.7, but I did look at the code for 
> AuthByLOADBALANCE and it does not seem to have changed in the latest 
> version. If there have been other changes to the retry behavior in 
> AuthByRADIUS, we’ll schedule the update and see what happens. From a very 
> cursory look, the underlying logic seems to be the same: AuthByRADIUS relies 
> on chooseHost() returning no host to stop retrying, and as long as it 
> supplies one the request will keep retrying. The one exception is that if 
> all target hosts have been marked as down, the AuthByLOADBALANCE 
> chooseHost() logs 
> "ProxyAlgorithm LOADBALANCE Could not find a working host to proxy to" and 
> the request stops retrying.
> 
> Regards-
> 
> 
> Frank Danielson | S.V.P. Engineering
> * [email protected]
> 
>> On Nov 4, 2020, at 4:46 PM, Hugh Irvine <[email protected]> wrote:
>> 
>> 
>> Hi Frank -
>> 
>> What version of Radiator are you running currently?
>> 
>> Hugh
>> 
>> 
>>> On 5 Nov 2020, at 04:10, Frank Danielson <[email protected]> wrote:
>>> 
>>> Good Day All-
>>> 
>>> We’ve been running AuthByLOADBALANCE for some time now and have noticed 
>>> that a message that does not get a response from the downstream hosts is 
>>> retried indefinitely. This not only keeps the message around forever, but 
>>> each failed attempt also increases the failure counts for the target 
>>> hosts, which makes them more likely to be marked unavailable and causes 
>>> delivery problems for other requests.
>>> 
>>> For example, a malformed request may be sent by an upstream client and 
>>> handled by AuthByLOADBALANCE, where the target hosts simply do not respond 
>>> to the proxied request because they don’t like it. The request is retried 
>>> on the current host Retries times by handle_timeout(), after which it is 
>>> handed off to failed(). failed() tracks MaxFailedRequests for the host, 
>>> marks it unavailable if applicable, and then hands the request to 
>>> forward(), which calls chooseHost() to find the next available host. The 
>>> stock chooseHost() in AuthByRADIUS tracks whether the request has reached 
>>> the end of the host list, but chooseHost() in AuthByLOADBALANCE will always 
>>> return a host if one is available, and it could even be the same host as 
>>> the last try if MaxFailedRequests has not been reached for that host. The 
>>> end result is that the request is retried forever, incrementing the 
>>> failure count for downstream hosts and causing them to be marked 
>>> unavailable.
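
The flow described above can be modelled outside Radiator with a small standalone Perl sketch. None of this is Radiator code: the host hashes, the choose_host() helper, and the safety cap are hypothetical stand-ins, and "fewest failures first" is only a placeholder for the real load-balancing choice. It is meant to show why a selector that never reports "end of list" for a given request keeps that request alive, assuming every send times out.

```perl
use strict;
use warnings;

# Toy model only: hypothetical stand-ins, not Radiator internals.
my $retries             = 2;      # per-host retransmissions (models Retries)
my $max_failed_requests = 100;    # models MaxFailedRequests, set high here
my @hosts = map { +{ name => $_, failed => 0, down => 0 } } qw(a b);

# Placeholder for the load-balancing choice: any host not marked down,
# preferring the one with the fewest failures. It keeps no per-request
# state, so it never signals that a request has run out of hosts.
sub choose_host {
    my @up = sort { $a->{failed} <=> $b->{failed} }
             grep { !$_->{down} } @hosts;
    return $up[0];
}

# Simulate one request that no host ever answers.
sub simulate {
    my ($cap) = @_;
    my $attempts = 0;
    while (my $host = choose_host()) {
        # Initial send plus $retries retransmissions, all timing out.
        $attempts += 1 + $retries;
        $host->{failed}++;
        $host->{down} = 1 if $host->{failed} >= $max_failed_requests;
        last if $attempts >= $cap;    # safety cap so the demo terminates
    }
    return $attempts;
}

my $attempts = simulate(30);
print "attempts: $attempts\n";
print "$_->{name} failed=$_->{failed}\n" for @hosts;
```

In this model the loop only ends because of the artificial cap, or once every host has been marked down, which matches the one exit path described in the thread.
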
>>> 
>>> After some looking at the code I think I could override failed() to track 
>>> the number of unique hosts to which a request has been forwarded with 
>>> something like 
>>> 
>>> $fp->{retryHosts}->{$host}++
>>> 
>>> and then add a couple of checks in chooseHost() that are similar to the 
>>> original one:
>>> 
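>>> # retryHosts is a hash, so count its keys rather than dereferencing
>>> # it as an array
>>> if (scalar(keys %{$fp->{retryHosts}}) < scalar(@{$self->{Hosts}}))
>>> {
>>>     foreach $host (@{$self->{Hosts}})
>>>     {
>>>         # skip hosts this request has already been forwarded to
>>>         next if ($fp->{retryHosts}->{$host});
>>>         …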
>>> 
>>> The end result is that the request is tried Retries times on each host, 
>>> moving to the next best candidate chosen by the volume algorithm each 
>>> time, until all hosts have been tried, at which point the request fails. 
>>> That may not be the optimal behavior but it beats trying forever.
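
A minimal standalone sketch of the per-request tracking idea, under the same caveat: the choose_host() and failed() subs below are hypothetical toy functions that only borrow the names from the thread, hosts are plain hashes keyed by an invented name field, and "fewest failures first" again stands in for the real algorithm. It assumes no host ever answers.

```perl
use strict;
use warnings;

# Toy model only: hypothetical stand-ins, not Radiator internals.
my @hosts = map { +{ name => $_, failed => 0 } } qw(a b c);

# Pick the best host this request has not tried yet, or nothing once
# every host has been tried.
sub choose_host {
    my ($request) = @_;
    return if scalar(keys %{ $request->{retryHosts} }) >= scalar(@hosts);
    my @candidates = sort { $a->{failed} <=> $b->{failed} }
                     grep { !$request->{retryHosts}{ $_->{name} } } @hosts;
    return $candidates[0];
}

# Record the host a request has just given up on (the job the proposal
# assigns to an overridden failed()).
sub failed {
    my ($request, $host) = @_;
    $host->{failed}++;
    $request->{retryHosts}{ $host->{name} }++;
    return;
}

# A request that no host ever answers now has a finite lifetime:
# it visits each host once and then stops.
my $request = { retryHosts => {} };
my @tried;
while (my $host = choose_host($request)) {
    push @tried, $host->{name};
    failed($request, $host);
}
print "hosts tried: ", scalar(@tried), "\n";
```

Keeping the state on the request rather than on the host means other requests still see the full host list, so one bad request cannot exhaust it for everyone else.
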
>>> 
>>> Before doing that and bearing the burden of maintaining a custom AuthBy, I 
>>> figured I’d send it to the list and see if someone else has already solved 
>>> this problem or if Open Systems would be willing to revisit the 
>>> AuthByLOADBALANCE logic. Perhaps the interpretation of Retries could be 
>>> changed to mean the total number of times a request is retried, instead of 
>>> a per-host number, so that a request has a finite lifetime? In that case 
>>> chooseHost() could be called for each retry in handle_timeout() to 
>>> increase the chances of success.
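
The alternative reading of Retries as a per-request budget could look roughly like the following standalone sketch. Again this is a toy model with hypothetical names, not Radiator code, and it assumes that every send times out and that each timeout is counted against the host that was chosen.

```perl
use strict;
use warnings;

# Toy model only: hypothetical stand-ins, not Radiator internals.
my $retries = 3;    # read here as total retransmissions per request
my @hosts   = map { +{ name => $_, failed => 0 } } qw(a b);

# Placeholder for the load-balancing choice: fewest failures first.
sub choose_host {
    my @sorted = sort { $a->{failed} <=> $b->{failed} } @hosts;
    return $sorted[0];
}

# One initial send plus at most $retries retransmissions, choosing a
# host afresh before every send. Every send times out in this model.
my @sent_to;
for my $attempt (0 .. $retries) {
    my $host = choose_host() or last;
    push @sent_to, $host->{name};
    $host->{failed}++;    # assumption: each timeout counts against the host
}
print "sends: ", scalar(@sent_to), "\n";
```

With a fixed budget the request lifetime is bounded regardless of how many hosts are configured, at the cost of possibly never trying some hosts when the list is longer than the budget.
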
>>> 
>>> Regards-
>>> 
>>> 
>>> Frank Danielson | S.V.P. Engineering
>>> * [email protected]
>>> 
>>> _______________________________________________
>>> radiator mailing list
>>> [email protected]
>>> https://lists.open.com.au/mailman/listinfo/radiator
>> 
>> 
>> --
>> 
>> Hugh Irvine
>> [email protected]
>> 
>> Radiator: the most portable, flexible and configurable RADIUS server 
>> anywhere. SQL, proxy, DBM, files, LDAP, NIS+, password, NT, Emerald, 
>> Platypus, Freeside, TACACS+, PAM, external, Active Directory, EAP, TLS, 
>> TTLS, PEAP, TNC, WiMAX, RSA, Vasco, Yubikey, MOTP, HOTP, TOTP,
>> DIAMETER, SIM, etc. 
>> Full source on Unix, Linux, Windows, macOS, Solaris, VMS, NetWare etc.
>> 
> 


--

Hugh Irvine
[email protected]

