Re: Server instability

2007-10-30 Thread Alan DeKok
Nicolai Tejlgaard Hansen wrote:
 I'm having the exact same problem as described below, with Freeradius
 1.7 hanging at 99 percent. Also using PEAP, MSCHAPV2, and eDir, and
 running 1.7 on a SLES 10 SP1.
 I have been using the same configuration since 1.3 without any
 problems, but since upgrading from 1.6 to 1.7 it's crashed 3 times
 within a month.

  As noted, it's 1.1.x, not 1.x.  But in any case, crashes are bad.

  It's difficult to know what changed from 1.1.3 to 1.1.7 that causes
the problem.  My suggestion, if you're willing to experiment, is to
build a version of 1.1.7 with the rlm_ldap module from 1.1.3.

  i.e.

$ rm freeradius-1.1.7/src/modules/rlm_ldap/*
$ cp freeradius-1.1.3/src/modules/rlm_ldap/* freeradius-1.1.7/src/modules/rlm_ldap
$ cd freeradius-1.1.7
$ ./configure ...
$ make

  Some things in rlm_ldap changed between the two versions.  This test
will let us know if the issue is in rlm_ldap, or elsewhere.
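
  Before swapping the module, it may help to see which rlm_ldap files
actually differ between the two trees.  The following is only a sketch:
it assumes both tarballs are unpacked side by side under the directory
names used above, and the function name changed_files is made up for
illustration.  Note that filecmp.dircmp compares files shallowly by
default (type, size and modification time first).

```python
import filecmp
import os


def changed_files(old_dir, new_dir):
    """Return (differing, only_in_old, only_in_new) file names for two directories."""
    comparison = filecmp.dircmp(old_dir, new_dir)
    return (
        sorted(comparison.diff_files),   # present in both, contents differ
        sorted(comparison.left_only),    # present only in old_dir
        sorted(comparison.right_only),   # present only in new_dir
    )


if __name__ == "__main__":
    # Assumed layout: both source trees unpacked in the current directory.
    old = os.path.join("freeradius-1.1.3", "src", "modules", "rlm_ldap")
    new = os.path.join("freeradius-1.1.7", "src", "modules", "rlm_ldap")
    if os.path.isdir(old) and os.path.isdir(new):
        differing, only_old, only_new = changed_files(old, new)
        print("changed:", differing)
        print("only in 1.1.3:", only_old)
        print("only in 1.1.7:", only_new)
```

  If a file exists only in the 1.1.7 tree, the rm step above matters:
without it, a stale newer file could be left behind in the build.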

  If it still crashes, tell us.  If it doesn't crash any more, please
tell us, too!

 As per Phil Mayers' request, I recompiled with the developer option (and
 --edir as I am using that). The following is the output:

  That doesn't give anything overly suspicious.  Oh well...

  I think I'll have to get a Mac with Leopard, and start using DTrace to
debug these kinds of problems.

  Alan DeKok.
-
List info/subscribe/unsubscribe? See http://www.freeradius.org/list/users.html


Re: Server instability

2007-10-29 Thread Nicolai Tejlgaard Hansen
I'm having the exact same problem as described below, with Freeradius
1.7 hanging at 99 percent. Also using PEAP, MSCHAPV2, and eDir, and
running 1.7 on a SLES 10 SP1.
I have been using the same configuration since 1.3 without any
problems, but since upgrading from 1.6 to 1.7 it's crashed 3 times
within a month.

This is the last line of my log files each time it hangs at 99
percent:
/usr/local/var/log/radius/radius.log-20070928.bz2:Thu Sep 27 16:32:23 2007 : Error: Discarding duplicate request from client 3WXM-2:20006 - ID: 247 due to unfinished request 23061
/usr/local/var/log/radius/radius.log-20071010.bz2:Sun Oct  7 21:19:47 2007 : Error: Discarding duplicate request from client 3WXM-1:20002 - ID: 188 due to unfinished request 60866
/usr/local/var/log/radius/radius.log-20071024.bz2:Wed Oct 24 08:10:05 2007 : Error: Discarding duplicate request from client 3WXM-2:20008 - ID: 211 due to unfinished request 94412
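
For anyone wanting to pull these entries out of a larger log, here is a
minimal sketch. It assumes each record sits on a single line in the
format shown above; the name parse_duplicate is hypothetical, and the
number after the client name is assumed to be the source port.

```python
import re

# Pattern for the "Discarding duplicate request" lines quoted above.
DUPLICATE_RE = re.compile(
    r"(?P<when>\w{3} \w{3} [ \d]\d \d{2}:\d{2}:\d{2} \d{4}) : Error: "
    r"Discarding duplicate request from client (?P<client>\S+):(?P<port>\d+) - "
    r"ID: (?P<id>\d+) due to unfinished request (?P<request>\d+)"
)


def parse_duplicate(line):
    """Return a dict of fields from one log line, or None if it does not match."""
    match = DUPLICATE_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    # Convert the numeric fields so they can be sorted or compared.
    for key in ("port", "id", "request"):
        fields[key] = int(fields[key])
    return fields
```

Because the pattern is applied with search(), a leading file name such
as the one grep prints in front of each line is simply skipped.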

As per Phil Mayers' request, I recompiled with the developer option (and
--edir as I am using that). The following is the output:

Attaching to program: /usr/local/sbin/radiusd, process 23141
Loaded symbols for /usr/local/sbin/radiusd
Reading symbols from /lib/libnsl.so.1...done.
Loaded symbols for /lib/libnsl.so.1
Reading symbols from /lib/libresolv.so.2...done.
Loaded symbols for /lib/libresolv.so.2
Reading symbols from /lib/libpthread.so.0...done.
[Thread debugging using libthread_db enabled]
[New Thread -1212410192 (LWP 23141)]
[New Thread -1272144992 (LWP 3414)]
[New Thread -1263752288 (LWP 1784)]
[New Thread -1255359584 (LWP 1757)]
[New Thread -1246966880 (LWP 23146)]
[New Thread -1238574176 (LWP 23145)]
[New Thread -1230181472 (LWP 23144)]
[New Thread -1221788768 (LWP 23143)]
[New Thread -1213396064 (LWP 23142)]
Loaded symbols for /lib/libpthread.so.0
Reading symbols from /usr/local/lib/libradius-1.1.7.so...done.
Loaded symbols for /usr/local/lib/libradius-1.1.7.so
Reading symbols from /lib/libcrypt.so.1...done.
Loaded symbols for /lib/libcrypt.so.1
Reading symbols from /usr/lib/libltdl.so.3...done.
Loaded symbols for /usr/lib/libltdl.so.3
Reading symbols from /lib/libdl.so.2...done.
Loaded symbols for /lib/libdl.so.2
Reading symbols from /usr/lib/libssl.so.0.9.8...done.
Loaded symbols for /usr/lib/libssl.so.0.9.8
Reading symbols from /usr/lib/libcrypto.so.0.9.8...done.
Loaded symbols for /usr/lib/libcrypto.so.0.9.8
Reading symbols from /lib/libc.so.6...done.
Loaded symbols for /lib/libc.so.6
Reading symbols from /lib/ld-linux.so.2...done.
Loaded symbols for /lib/ld-linux.so.2
Reading symbols from /lib/libnss_files.so.2...done.
Loaded symbols for /lib/libnss_files.so.2
Reading symbols from /usr/local/lib/rlm_exec-1.1.7.so...done.
Loaded symbols for /usr/local/lib/rlm_exec-1.1.7.so
Reading symbols from /usr/local/lib/rlm_expr-1.1.7.so...done.
Loaded symbols for /usr/local/lib/rlm_expr-1.1.7.so
Reading symbols from /usr/local/lib/rlm_pap-1.1.7.so...done.
Loaded symbols for /usr/local/lib/rlm_pap-1.1.7.so
Reading symbols from /usr/local/lib/rlm_mschap-1.1.7.so...done.
Loaded symbols for /usr/local/lib/rlm_mschap-1.1.7.so
Reading symbols from /usr/local/lib/rlm_ldap-1.1.7.so...done.
Loaded symbols for /usr/local/lib/rlm_ldap-1.1.7.so
Reading symbols from /usr/lib/libldap_r-2.3.so.0...done.
Loaded symbols for /usr/lib/libldap_r-2.3.so.0
Reading symbols from /usr/lib/liblber-2.3.so.0...done.
Loaded symbols for /usr/lib/liblber-2.3.so.0
Reading symbols from /usr/lib/libsasl2.so.2...done.
Loaded symbols for /usr/lib/libsasl2.so.2
Reading symbols from /usr/local/lib/rlm_unix-1.1.7.so...done.
Loaded symbols for /usr/local/lib/rlm_unix-1.1.7.so
Reading symbols from /usr/local/lib/rlm_eap-1.1.7.so...done.
Loaded symbols for /usr/local/lib/rlm_eap-1.1.7.so
Reading symbols from /usr/local/lib/libeap-1.1.7.so...done.
Loaded symbols for /usr/local/lib/libeap-1.1.7.so
Reading symbols from /usr/local/lib/rlm_eap_md5-1.1.7.so...done.
Loaded symbols for /usr/local/lib/rlm_eap_md5-1.1.7.so
Reading symbols from /usr/local/lib/rlm_eap_leap-1.1.7.so...done.
Loaded symbols for /usr/local/lib/rlm_eap_leap-1.1.7.so
Reading symbols from /usr/local/lib/rlm_eap_gtc-1.1.7.so...done.
Loaded symbols for /usr/local/lib/rlm_eap_gtc-1.1.7.so
Reading symbols from /usr/local/lib/rlm_eap_tls-1.1.7.so...done.
Loaded symbols for /usr/local/lib/rlm_eap_tls-1.1.7.so
Reading symbols from /usr/local/lib/rlm_eap_ttls-1.1.7.so...done.
Loaded symbols for /usr/local/lib/rlm_eap_ttls-1.1.7.so
Reading symbols from /usr/local/lib/rlm_eap_peap-1.1.7.so...done.
Loaded symbols for /usr/local/lib/rlm_eap_peap-1.1.7.so
Reading symbols from /usr/local/lib/rlm_eap_mschapv2-1.1.7.so...done.
Loaded symbols for /usr/local/lib/rlm_eap_mschapv2-1.1.7.so
Reading symbols from /usr/local/lib/rlm_realm-1.1.7.so...done.
Loaded symbols for /usr/local/lib/rlm_realm-1.1.7.so
Reading symbols from /usr/local/lib/rlm_preprocess-1.1.7.so...done.
Loaded symbols for /usr/local/lib/rlm_preprocess-1.1.7.so
Reading symbols from 

Re: Server instability

2007-10-29 Thread A . L . M . Buxey
Hi,

 I'm having the exact same problem as described below, with Freeradius
 1.7 hanging at 99 percent. Also using PEAP, MSCHAPV2, and eDir, and
 running 1.7 on a SLES 10 SP1.
 I have been using the same configuration since 1.3 without any
 problems, but since upgrading from 1.6 to 1.7 it's crashed 3 times
 within a month.

these errors are solely because the RADIUS server doesn't get a response
from your eDirectory fast enough. please also note there is no
1.6 or 1.7 release.  it's 1.1.6 and 1.1.7  (there's a vast difference
when people start throwing random numbers around).  if things
worked with 1.1.3 you need to check your config and logs to
find out why 1.1.6/1.1.7 aren't as sprightly. how often are
you talking to eDirectory? do you really need to check pre-auth,
auth and post-auth for example?  are you relying on some MySQL
or postgres accounting?  It may well be that the eDirectory is
100% fine but the hold-up is with accounting packet handling.
or, more obviously, more people are using your service and you don't
have enough RADIUS threads running to handle the requests, as you only
have e.g. 10 threads and they are already busy talking to eDirectory
as fast as they can.
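
as a rough back-of-the-envelope illustration of the thread argument:
if every worker thread blocks on the directory for the whole request,
throughput is bounded by threads divided by seconds per request. the
numbers below are assumptions for illustration only, not measurements
from anyone's server.

```python
def max_requests_per_second(threads, seconds_per_request):
    """Upper bound on throughput when each thread handles one blocking request at a time."""
    if threads <= 0 or seconds_per_request <= 0:
        raise ValueError("threads and seconds_per_request must be positive")
    return threads / seconds_per_request


# Hypothetical numbers: 10 threads, each waiting 0.5 s on the directory.
print(max_requests_per_second(10, 0.5))  # 20.0
```

once requests arrive faster than that bound, they queue up, and the NAS
starts retransmitting while the originals are still unfinished.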


alan


Re: Server instability

2007-10-29 Thread Nicolai Tejlgaard Hansen
Hi
 
I don't think I was clear enough about what I meant. I have been using versions
1.1.3, 1.1.4, 1.1.5 and 1.1.6 and have never seen this problem before, with the
exact same configuration. This configuration has been working for 18 months or
so.

Whether this is a SLES 10 SP1 problem or a problem with Freeradius 1.1.7, I
don't know, but something definitely seems to be going wrong.

As for pre- and post-auth, they are both empty (i.e. everything is commented out
and does nothing). Thus there is no SQL or any other form of accounting being
done. I am only authenticating.

I have a redundant server (version 1.1.6) that's been running for four and a half
months now (again, exact same configuration, same eDir servers etc.), so that
leads me to believe that this is indeed a problem with Freeradius 1.1.7 or
SLES 10 SP1 or the combination of the two.
 
- Nicolai


Server instability

2007-09-24 Thread Nathan Hay
I am a newbie, running 3 (for redundancy) FreeRadius servers (1.1.7) on
SUSE 10 SP1 (32-bit) to authenticate our wireless clients (PEAP
MSCHAPv2) to our eDirectory via LDAP.  We average 800-900 simultaneous
wireless clients (need to support a potential 4K in the future).
 
The setup works well and authenticates users very quickly, but every
couple days, the radiusd process will either blow up and start consuming
99% of the CPU or die altogether.  More often it blows up.  We had
stability problems initially, even when the process was running, so I
took everything out of the config that we didn't need and that seemed to
help.
 
Can anyone comment on our configuration and tell me if I'm doing
something wrong?  This is my first FreeRadius deployment and I don't
consider myself a Linux guru, let alone claim to know much about
Radius.
 
Thanks in advance,
 
Nathan Hay
Network Engineer
Cedarville University
 
prefix = /usr/local
exec_prefix = ${prefix}
sysconfdir = ${prefix}/etc
localstatedir = ${prefix}/var
sbindir = ${exec_prefix}/sbin
logdir = ${localstatedir}/log/radius
raddbdir = ${sysconfdir}/raddb
radacctdir = ${logdir}/radacct
confdir = ${raddbdir}
run_dir = ${localstatedir}/run/radiusd
log_file = ${logdir}/radius.log
libdir = ${exec_prefix}/lib
pidfile = ${run_dir}/radiusd.pid
 
user = radiusd
group = radiusd
 
max_request_time = 30
delete_blocked_requests = no
cleanup_delay = 5
max_requests = 512000
 
bind_address = *
port = 0
 
hostname_lookups = no
allow_core_dumps = no
regular_expressions = yes
extended_expressions= yes
log_stripped_names = yes
log_auth = no
log_auth_badpass = no
log_auth_goodpass = no
 
usercollide = no
 
lower_user = no
lower_pass = no
 
nospace_user = no
nospace_pass = no
 
checkrad = ${sbindir}/checkrad
 
security {
    max_attributes = 200
    reject_delay = 1
    status_server = no
}

proxy_requests = no
$INCLUDE ${confdir}/clients.conf
snmp = no

thread pool {
    start_servers = 16
    max_servers = 64
    min_spare_servers = 8
    max_spare_servers = 16
    max_requests_per_server = 0
}

modules {
    $INCLUDE ${confdir}/eap.conf

    mschap {
        authtype = MS-CHAP
        use_mppe = yes
        require_encryption = yes
        require_strong = yes
    }

    ldap {
        server = XXX
        identity = cn=XXX,o=XXX
        password = XXX
        basedn = o=XXX
        filter = (cn=%{Stripped-User-Name:-%{User-Name}})
        base_filter = (objectclass=radiusprofile)
        start_tls = yes
        tls_cacertfile = /usr/local/etc/raddb/certs/ldap.cer
        tls_cacertdir = /usr/local/etc/raddb/certs/
        tls_require_cert = demand
        dictionary_mapping = ${raddbdir}/ldap.attrmap
        ldap_connections_number = 10
        password_attribute = nspmPassword
        edir_account_policy_check = no
        timeout = 4
        timelimit = 3
        net_timeout = 1
    }
}

authorize {
    mschap
    eap
    ldap
}

authenticate {
    Auth-Type MS-CHAP {
        mschap
    }
    Auth-Type LDAP {
        ldap
    }
    eap
}

post-auth {
    ldap
    Post-Auth-Type REJECT {
        ldap
    }
}
 

Re: Server instability

2007-09-24 Thread Phil Mayers
On Mon, 2007-09-24 at 15:39 -0400, Nathan Hay wrote:
 I am a newbie, running 3 (for redundancy) FreeRadius servers (1.1.7)
 on SUSE 10 SP1 (32-bit) to authenticate our wireless clients (PEAP
 MSCHAPv2) to our eDirectory via LDAP.  We average 800-900 simultaneous
 wireless clients (need to support a potential 4K in the future).
  
 The setup works well and authenticates users very quickly, but every
 couple days, the radiusd process will either blow up and start
 consuming 99% of the CPU or die altogether.  More often it blows up.
 We had stability problems initially, even when the process was
 running, so I took everything out of the config that we didn't need
 and that seemed to help.

First question; are you HUPing the daemon? If so, don't - it won't work
well.

Second question; if this happens reliably can you recompile from
scratch:

./configure --enable-developer
make
make install

...and when it happens do this:

gdb /usr/local/sbin/radiusd
set pagination off
set logging file /root/radiusd-wireless.txt
set logging on
attach $PID
thread apply all bt full

...that'll give some details as to what the server is doing when it
pegs the CPU. Other options are strace or (if your Linux system has it)
SystemTap. The aim being to determine what it's doing when it goes wrong.
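
If the hang tends to happen when nobody is at the keyboard, the same
backtrace can be captured non-interactively. The following is only a
sketch, assuming a gdb that accepts the -p, -batch and -ex options; the
function names are made up for illustration, and attaching to the
process will normally require root.

```python
import subprocess


def gdb_backtrace_command(pid):
    """Build a gdb command line that attaches to pid and dumps all thread backtraces."""
    return [
        "gdb",
        "-p", str(pid),                      # attach to the running process
        "-batch",                            # exit once the commands have run
        "-ex", "set pagination off",
        "-ex", "thread apply all bt full",
    ]


def capture_backtrace(pid, out_path):
    """Run gdb and write everything it prints to out_path."""
    with open(out_path, "w") as out:
        subprocess.run(gdb_backtrace_command(pid), stdout=out,
                       stderr=subprocess.STDOUT, check=False)
```

For example, capture_backtrace(pid, "/root/radiusd-wireless.txt") would
write to the same file name used in the interactive session above.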

 



Re: Server instability

2007-09-24 Thread Matt Ashfield
What kind of error messages are you getting in your log when it blows up?



Re: Server instability

2007-09-24 Thread Phil Mayers
On Mon, 2007-09-24 at 20:46 -0300, Matt Ashfield wrote:
 What kind of error messages are you getting in your log when it blows up?

Since you put me in the To: field as well as the list (please don't do
that, it breaks people's list filtering and is annoying) it's not clear
to me who you're talking to - the original poster or me. If the latter
then you are confused; I'm not having problems.


