On Jul 11, 2009, at 2:23 PM, Alan DeKok wrote:

Philip Molter wrote:
I did try that.  It did not do what I was attempting to do.

 Hmm... "it didn't work".

I apologize for not being more specific. The retransmits kept getting sent to the same failed home server, rather than that home server being marked dead and the retransmits going to a different home server. I have figured out why. The minimum zombie_period is 20, hard-coded in realms.c. The zombie_period of 5 you recommended, which I tried, was not taking effect, which led to my 20-second test timeout kicking in before the proxy had waited long enough to actually mark the server as dead (the 5th retransmit would have triggered the failover, but the proxy only received 3 retransmits).
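
For reference, this is roughly the kind of clamp I am talking about; it is paraphrased from my reading of realms.c rather than copied from it, so the exact names and bounds may differ:

    /* Configured values below the hard-coded floor are silently raised,
     * which is why my zombie_period = 5 never took effect. */
    if (home->response_window < 5) home->response_window = 5;
    if (home->zombie_period < 20) home->zombie_period = 20;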

With the zombie_period actually taking effect, that mechanism does exactly what I want for this case. I can provide a patch that does the following:

a) allows values lower than 5 for response_window and lower than 20 for zombie_period (I will not change the recommended values)
b) makes the post_proxy_fail_handler optional on a pool-by-pool basis (a rough sketch of what I mean follows below)
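
To make (b) concrete, here is a rough sketch of the check I have in mind. The flag name is invented for illustration, and the surrounding code is paraphrased rather than taken from the tree:

    /* Hypothetical per-pool flag: when set, skip the forced failure
     * handling on response_window expiry, so the request stays in its
     * proxied state and later retransmits can fail over. */
    if (request->home_pool && request->home_pool->skip_post_proxy_fail) {
            return;
    }

    /* Existing behaviour: run the post-proxy fail handler, which ends
     * up replying to the NAS with a reject. */
    post_proxy_fail_handler(request);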

Does that seem acceptable? You seem hesitant to accept a solution that you do not think would be used by more than a few people. This solution will be minimally invasive to the code.

Also, is there a configuration with which the retransmit proxy failover code could actually be triggered without the patch? I cannot see one. Failover only happens after the response_window is exceeded, and once the response_window is exceeded, the original request is replied to with an Access-Reject, which means retransmits arriving after that point will never find the request in the REQUEST_PROXIED state in received_retransmits(). Am I reading that correctly?
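
In other words, my reading of the flow, paraphrased rather than quoted from the actual code:

    /* received_retransmits(), roughly: */
    if (request->child_state == REQUEST_PROXIED) {
            /* Still waiting on the home server: the only place the
             * retransmit-driven failover can happen. */
            failover_to_another_home_server(request); /* name invented */
    } else {
            /* Once the response_window has expired and the reject has
             * been sent, the request is no longer REQUEST_PROXIED, so
             * retransmits land here and the failover never runs. */
    }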

Does that seem like a method that can work?  Again, not to replace
anything but to supplement it?

 It's an *additional* method, rather than a *better* method.  It
requires additional code, additional state machine checks, and as
such... I'm biased against it. It's just too much like a site-specific
hack for it to be integrated into the main distribution.

Adding a *better* method is the preferred approach. It's OK to change
existing behavior, so long as it's for the better.

You had the retransmit failover code already written. It seems not much needs to be done to allow a pool configuration to continue on after the response_window has been exceeded. Let me submit a patch, and you can see what you think.

Many NASes can use an internal user cache as a backup to a
non-responding or slowly-responding RADIUS server. If the proxy returns
an actual Access-Reject, the NAS accepts that and treats the request as
invalid.  If the proxy returns nothing, the NAS can say, "Well, my
RADIUS server is down, but I have a record for the same user/pass in
my cache, and it previously received an Access-Accept.  Let me accept
this request." That does not break any RADIUS specification I know of,

 Nonsense.  NASes that cache authentication credentials are
*completely* outside of the RADIUS specification.  It's like adding a
jet engine to a car.  There's no law *preventing* it, so it must be
legal, right?

Well, there's nothing in the RADIUS specification that describes or even recommends how a NAS should handle a lack of response. You make it sound as if the NAS is doing something illegal by using a previously cached accept. It's not. The NAS can implement whatever logic it wants, and that particular feature leads to a better user experience. Just because you consider a failure to contact the server to be the same as a denial does not mean other vendors have not come up with solutions that work around it.

RFC 2607 is clear that the proxy should not respond to the client unless it receives a reply from the home server. At the very least, returning a rejection is not an accurate portrayal of the state of the authentication. It would be more accurate to just let it time out, but I understand returning the rejection so that the NAS can short-circuit the transaction more quickly.
