Vinay Wagh wrote:
> Which is why it took me some time to figure this out. What I did was
> added debug code in rbtree_insert to print out contents of the node if a
> duplicate node existed. In the logs I saw that the node had the same
> state but a different identity.

Ok. That shouldn't be happening. It may be an internal race condition in the server.

> The reason I started debugging this problem is because I started getting
> Access-Reject without RADIUS_ATTR_MESSAGE_AUTHENTICATOR which is a
> separate attribute in the radius message.

Yes... Please also use the common name "Message-Authenticator". RADIUS_ATTR_MESSAGE_AUTHENTICATOR is an implementation-specific name on your system.

> I also observed that some
> replies from the radius server had this field but it did not match the
> authenticator in the original request.

That statement makes no sense. The Message-Authenticator attribute is not supposed to be matched with anything. Are you saying that it fails validation on the client?

> Then I tried to link this to the
> problem I found and I think it is possible if we generate the same
> state. Assume that after the server sends the access challenge the
> radius server fails to insert the handler because there is already a
> duplicate and then before it gets rid of the duplicate handler the
> client replies. In this case the radius server will try to look for the
> handler and actually find it since the id, ip addr and state is the same
> but the identity is different. It can then use that context to reply to
> this request in which case the fields may not match. If the radius
> server had already replied to the duplicate handler then it will not
> find the handler for our current request and send an Access-Reject.

OK...

> So whether we get an Access-Reject or a reply with invalid message
> authenticator depends on the timing of whether the radius server still
> has the duplicate context or not. Is that possible though?

I can understand why it would send an Access-Reject. I don't understand why it would reply with an invalid Message-Authenticator. The calculation for Message-Authenticator is done in src/lib/radius.c, which is independent of any issues in rlm_eap.
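
For reference, the collision described in the quoted scenario can be sketched as follows. This is only an illustration, not the rlm_eap code: the `session_key` structure, the `STATE_LEN` value, and the `session_cmp()` function are hypothetical names. It assumes, as the quoted text says, that handlers are looked up by client IP address, EAP id, and State, and that the identity plays no part in the comparison.

```c
#include <stdint.h>
#include <string.h>

#define STATE_LEN 16 /* hypothetical length, chosen for the example */

/* Hypothetical lookup key for an in-progress EAP session. */
typedef struct session_key {
    uint32_t    src_ipaddr;       /* client IP address */
    uint8_t     eap_id;           /* EAP identifier */
    uint8_t     state[STATE_LEN]; /* State value sent in the Access-Challenge */
    const char *identity;         /* carried along, but NOT compared */
} session_key;

/* Order two sessions by IP address, then EAP id, then State.
 * Returns 0 when all three match, whatever the identities are. */
int session_cmp(const session_key *a, const session_key *b)
{
    if (a->src_ipaddr != b->src_ipaddr)
        return (a->src_ipaddr < b->src_ipaddr) ? -1 : 1;
    if (a->eap_id != b->eap_id)
        return (a->eap_id < b->eap_id) ? -1 : 1;
    return memcmp(a->state, b->state, STATE_LEN);
}
```

With a comparison like this, two sessions for different users compare as equal whenever the same State is generated twice for the same client and id, so a tree insert sees the second one as a duplicate and a later lookup can return the wrong session.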

> That's true, but how do we deal with this as a general solution? If the
> product we build needs to sit in a network where they have deployed free
> radius we cannot modify their code and will need a solution from the
> freeradius community. It seems like we need to investigate the
> generate_state() function and see if we really generate random states.

If it's a bug in FreeRADIUS, it needs to be fixed.

The fact that this is seen only at 1000 EAP packets/s indicates it might be a race condition. That is, few systems currently handle 1000 EAP packets/s. Even in your test, I suspect it's doing EAP-MD5. If not, it's doing another non-TLS EAP method. If it is doing a TLS method, then either it has hardware acceleration, or you have *huge* amounts of CPU power available to the RADIUS server.

> I have uploaded the log file you can download it at
> http://download.yousendit.com/D1B4E30A06784505 (it's 3MB and I was not
> sure if I would get flamed for attaching it, it will be available for 14
> days at this location). Here are the important line numbers

Ok. As a potential work-around, try editing src/include/libradius.h. Look for the lrad_randctx structure, and change every entry from:

	uint32_t ...

to

	uint32_t volatile ...

That *might* help. If it does, I'll commit the change.

Alan DeKok.
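
As a rough sketch of what the suggested edit looks like, here is a stand-in structure before and after the change. The names `example_randctx`, `example_randctx_volatile`, and their members are hypothetical; the real fields of lrad_randctx in src/include/libradius.h may differ, and only the `uint32_t` to `uint32_t volatile` change is taken from the suggestion above.

```c
#include <stdint.h>

/* Hypothetical stand-in for the structure before the edit. */
typedef struct example_randctx {
    uint32_t count;
    uint32_t results[256];
} example_randctx;

/* The same stand-in after the edit: every uint32_t member
 * becomes uint32_t volatile. The layout does not change. */
typedef struct example_randctx_volatile {
    uint32_t volatile count;
    uint32_t volatile results[256];
} example_randctx_volatile;

/* Hypothetical helper: hand out the next stored value. Because the
 * members are volatile, the compiler must reload them from memory
 * on every access instead of keeping them in registers. */
uint32_t example_next(example_randctx_volatile *ctx)
{
    uint32_t i = ctx->count % 256;

    ctx->count = ctx->count + 1;
    return ctx->results[i];
}
```

Note that `volatile` only stops the compiler from caching these values; it does not by itself add any locking between threads, which is one reason the change is a "might help" rather than a certain fix.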