Hi all,

I have a similar problem on a machine running CentOS 5 Update 4. The FreeRADIUS packages I use come from the jdennis repository (http://people.redhat.com/jdennis/freeradius-rhel-centos/).
Package versions are:
freeradius2-utils-2.1.8-2.el5
freeradius2-2.1.8-2.el5
freeradius2-mysql-2.1.8-2.el5
freeradius2-perl-2.1.8-2.el5


I configured two different FreeRADIUS servers on the same machine. Compared to the default configuration, the first one adds sqlippool, and the second one calls an external Perl script which in turn connects via ssh to another server and runs some other scripts on it.

The first server, with sqlippool, has been running for more than 2 months without any problem at all.

The second one, which calls the external Perl script, hangs after a few hours.
When I issue the status command, the result is:
# /etc/init.d/radiusd status
radiusd dead but pid file exists

I configured FreeRADIUS to call the script with the perl module as well as with the exec module, with identical results. After the radius stops, I see that the Perl script's log file always ends at the point where it tries to ssh to the other server. The other server's statistics show nothing unusual (e.g. high CPU), and there is no error in its ssh log file. Of course, the issue is not whether there is some problem with the Perl script, the ssh command or the remote server, but that the radius hangs when the external script it calls hangs.

Note that this behavior can be reproduced by calling an external script like the following:
#!/usr/bin/perl

use strict;

my $rc=system("ssh 1.1.1.1");
exit($rc);
Here 1.1.1.1 is just an IP address that causes the ssh to time out.
Note that the FreeRADIUS server does not hang when started in debug mode.
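
As a possible stopgap on the script side, the external command could be given a hard time limit so that the script always returns to the radius. The following is only a minimal sketch: run_with_timeout is a hypothetical helper name, and the return value 124 on timeout is an arbitrary choice. It does not explain or fix the radius stopping; it only bounds how long the external call can block.

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical helper (not part of FreeRADIUS): run a command in a child
# process and kill it if it has not finished after $timeout seconds.
# Returns the command's exit code, or 124 (arbitrary choice) on timeout.
sub run_with_timeout {
    my ($timeout, @cmd) = @_;

    my $pid = fork();
    die "fork failed: $!" unless defined $pid;

    if ($pid == 0) {
        # Child: replace ourselves with the command, bypassing the shell.
        exec { $cmd[0] } @cmd or POSIX::_exit(127);
    }

    my $status;
    eval {
        local $SIG{ALRM} = sub { die "timeout\n" };
        alarm($timeout);
        waitpid($pid, 0);      # interrupted when the alarm fires
        $status = $?;
        alarm(0);
    };
    alarm(0);                  # make sure no alarm is left pending
    if ($@) {
        die $@ unless $@ eq "timeout\n";
        kill('KILL', $pid);    # the command is stuck: kill and reap it
        waitpid($pid, 0);
        return 124;
    }
    return $status >> 8;
}
```

In the test script above, the system() line could then become something like my $rc = run_with_timeout(30, 'ssh', '-o', 'BatchMode=yes', '-o', 'ConnectTimeout=10', '1.1.1.1'); where the 30 and 10 second values are assumptions to be tuned. BatchMode and ConnectTimeout are standard OpenSSH client options.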

We have used exactly the same Perl script for the last few years without any problem on another machine, which runs FreeRADIUS version 1.0.1.

Regards,
Stylianos


On 16/4/2010 1:05 PM, Alan DeKok wrote:
Jakob Hirsch wrote:
we have freeradius 2.1.8 running on a couple of servers and are very
happy with it. But every few days FR crashes on one of the servers (a
random one, not always the same). The load is significant (average 150
requests/s per server, 400/s peak) but surely not too high. So
everything seems to run fine besides the annoying crashes, which alarm
people and make the weekly availability reports look bad (even though FR
is restarted automatically, of course). The previous 1.1.8 installation
we upgraded 6 months ago from did not have this problem.
   Hmm... I've run it at 20K pps for *days*....

Anyways, I really want to find out what's going wrong, so I wanted to
get core dumps of these crashes. Only that I just don't get them.
- radiusd.conf has allow_core_dumps = yes (and FR says "Info: Core dumps
are enabled." at startup)
- /proc/sys/kernel/core_pattern is set to '/tmp/core.%t.%e.%p', so core
dumps can be written to disk (tested with a little program that forces
a segfault)
- I put "ulimit -c unlimited" in the startup script.
cat /proc/$(pidof freeradius)/limits shows "unlimited" for soft and hard
limit of "Max core file size"
   Often 'root' can't core dump, and programs that change uid can't core
dump.  It's hard to know what's going on with the OS.

So what's missing? The only indication of the crash is this line in syslog:

Apr 10 17:57:19 xxxxxxxx kernel: [12268615.000288] freeradius[14846]: segfault at 73818 ip 00007f0cb40e875e sp 00007fff9c6304c0 error 4 in libfreeradius-radius-2.1.8.so[7f0cb40d1000+1f000]
(This is debian lenny x86_64, btw.)

Any hints?
   doc/bugs.  You'll need symbols to find out what's going on.

   Alan DeKok.


-
List info/subscribe/unsubscribe? See http://www.freeradius.org/list/users.html