On 05/02/13 10:20, Antonio Alberola wrote:
Dear All,

I'm having random authentication failures and I think they are due to an
internal failure in the Radius server. I use Radius to authenticate email
users against Windows Active Directory via PAM. Previously I used NTLM and
Kerberos together; now I use PAM.

This is confusing. FreeRADIUS is calling the "pam" module, yes? So what is the PAM stack calling?

The problem is as follows. Users authenticate properly throughout the day,
but then authentication suddenly begins to fail, and an authentication error
appears even when the credentials are right. Once the failure starts, the
service degrades exponentially and validates only 1 of every 20 requests.
The onset of failure seems to coincide with one of these three messages:

Those messages are a symptom; your PAM module is taking too long to respond. You need to investigate what the PAM stack is calling, why it is hanging, and how to reduce the timeouts or improve the speed of failure detection.

This is not a FreeRADIUS problem.


Tue Jan 30 08:27:38 2013 : Error: Received conflicting packet from client
localhost port 14038 - ID: 194 due to unfinished request 161451.  Giving up
on old request.
Tue Jan 30 08:27:52 2013 : Error: Request 161507 has been waiting in the
processing queue for 11 seconds.  Check that all databases are running
properly!
Fri Feb  1 14:55:15 2013 : Info: WARNING: Child is hung for request 3609 in
component <core> module <queue>.

The workaround we are applying at the moment is restarting Radius. Sometimes
restarting does not fix the problem and we have to configure Radius to allow
all connections. A few minutes later, we switch back to the normal
configuration and it works again. The biggest drawback, besides annoying the
users, is that Windows AD accounts are blocked because of the failures.

I need help to find the cause of the problem and fix it. I do not know yet
if the problem is in the domain controllers, in the PAM module or in Radius.
But everything seems to point to Radius.

In short: the problem you are experiencing with FreeRADIUS is because your authentication mechanism (PAM) is taking too long to respond. This is consuming all threads in the pool, which explains the log messages you see.

Fix the PAM stack to fail over properly, and this problem will go away.
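
To illustrate why a slow backend produces the "waiting in the processing
queue" messages, here is a minimal sketch in Python. It is a toy model of a
fixed-size worker pool, not FreeRADIUS code: the function name queue_waits,
the pool size of 4, and all of the timings are assumptions chosen only for
illustration.

```python
import heapq


def queue_waits(pool_size, service_time, arrival_interval, n_requests):
    """Return how long each request waits for a free worker, in seconds.

    Toy model: requests arrive at a fixed interval and each one occupies
    a worker for service_time seconds (the time the backend takes).
    """
    # Each heap entry is the time at which one worker becomes free.
    free_at = [0.0] * pool_size
    heapq.heapify(free_at)

    waits = []
    for i in range(n_requests):
        arrival = i * arrival_interval
        earliest_free = heapq.heappop(free_at)
        # A request starts when it has arrived AND a worker is free.
        start = max(arrival, earliest_free)
        waits.append(start - arrival)
        heapq.heappush(free_at, start + service_time)
    return waits


# Hypothetical numbers: 4 workers, one request every 100 ms.
healthy = queue_waits(4, service_time=0.05, arrival_interval=0.1, n_requests=200)
hung = queue_waits(4, service_time=30.0, arrival_interval=0.1, n_requests=200)

print("worst wait, fast backend:", max(healthy))
print("worst wait, hung backend:", max(hung))
```

With a fast backend no request ever waits. Once each call blocks for 30
seconds, the first four requests occupy every worker and everything behind
them queues for longer and longer. Adding workers only delays that point;
the backend has to answer, or fail, quickly.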