Re: HUP stops radiusd

2007-05-16 Thread Alan DeKok
John Horne wrote:
 I made the change and started Freeradius (from /etc/init.d). I could
 repeatedly HUP the daemon, and it would stay running according to 'ps'.
 However, the log file only showed one line for the first HUP and nothing
 at all after that. The line it showed was:
 
 Tue May 15 16:48:01 2007 : Info: Reloading configuration files.
 
 If I made a change to the 'users' file, HUP'd the daemon and then tried
 to run radtest, the daemon died. Nothing new in the log file. If I ran
 strace on the daemon it showed:

  That's encouraging.  We already know that HUP is fairly broken.  So
strange behavior on HUP is... expected.

 If I start radiusd using '/usr/sbin/radiusd -X', then I can see each HUP
 causes the config files to be read. Radiusd shows at each HUP:

  i.e. it works.

  So it's an OpenSSL bug.  Calling SSL_library_init() does NOT clear any
errors on OpenSSL's error stack.  That's bad.
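
  To illustrate why a stale entry on an error stack matters, here is a toy
model in Python. It is only a sketch: the ErrorQueue class and the
load_certificate() function are hypothetical and are not OpenSSL's API, and
it assumes the caller decides success by checking whether the queue is
empty. The clear_first flag stands in for an explicit clear-error call
made before the load.

```python
from collections import deque

BEGIN_MARKER = "-----BEGIN CERTIFICATE-----"


class ErrorQueue:
    """Toy stand-in for a library-wide error queue (not OpenSSL's real API)."""

    def __init__(self):
        self._errors = deque()

    def push(self, message):
        self._errors.append(message)

    def peek(self):
        # Oldest queued error, or None when the queue is empty.
        return self._errors[0] if self._errors else None

    def clear(self):
        self._errors.clear()


def load_certificate(queue, pem_text, clear_first):
    """Hypothetical loader that judges success by inspecting the queue."""
    if clear_first:
        queue.clear()  # drop anything left over from earlier calls
    if BEGIN_MARKER not in pem_text:
        queue.push("no start line")
    # Any queued error, old or new, is read as a failure of this call.
    return queue.peek() is None


# Placeholder certificate text; the body is not real base64 data.
PEM = BEGIN_MARKER + "\n(body omitted)\n-----END CERTIFICATE-----\n"
```

  With a leftover "no start line" entry on the queue, loading a perfectly
good certificate reports failure unless the queue is cleared first.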

 However, trying to then use radtest causes a segfault. No core file,
 although I have set 'allow_core_dumps'. Running an strace of the
 'radiusd -X' process shows:

  That's bad.

  I'll run it under valgrind to see if it gives any more information.

  But it looks like the clear error call fixes part of the problem.
The next step is to get HUP handling to work without crashing the server.

  Alan DeKok.
--
  http://deployingradius.com   - The web site of the book
  http://deployingradius.com/blog/ - The blog
- 
List info/subscribe/unsubscribe? See http://www.freeradius.org/list/users.html


Re: HUP stops radiusd

2007-05-15 Thread John Horne
On Mon, 2007-05-14 at 22:56 +0200, Alan DeKok wrote:
 John Horne wrote:
 ...
  Mon May 14 13:38:54 2007 : Info: rlm_eap_tls: Loading the certificate
  file as a chain
  Mon May 14 13:38:54 2007 : Error: rlm_eap: SSL error error:0906D06C:PEM
  routines:PEM_read_bio:no start line
 
   Ah I think what's happening is that OpenSSL is caching the file
 from the last time it was read.  So the server starts, and reads 1
 certificate from the file.  OpenSSL leaves the file open, or remembers
 where it left off.  When FreeRADIUS asks OpenSSL to read the file again,
 OpenSSL continues from where it left off, rather than starting from the
 beginning of the file.
 
Well I like the explanation, but unfortunately it doesn't work. Radiusd
still dies at the first HUP.

However, one thing I have noticed is that if I start Freeradius up
from /etc/init.d (this is a CentOS server so I used 'service radiusd
start'), then I can HUP the daemon once and it stays running. HUP it a
second time and it fails (this is with one certificate in the file). If
I start Freeradius as '/usr/sbin/radiusd -X', and HUP it, then it fails
straight away. In both cases the failure messages are the same as those
originally reported.
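
The manual check described above (HUP the daemon, see whether it is still
there, HUP it again) can be scripted. The following is a minimal sketch,
assuming the daemon's PID is already known; the function names and the
settle delay are illustrative choices and not part of FreeRADIUS.

```python
import os
import signal
import time


def is_running(pid):
    """Return True if a process with this PID exists.

    Signal 0 performs the existence check without delivering a signal.
    Note that an unreaped (zombie) child of this script still counts
    as existing.
    """
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def count_survived_hups(pid, attempts=2, settle=1.0):
    """Send SIGHUP up to `attempts` times; return how many were survived."""
    survived = 0
    for _ in range(attempts):
        try:
            os.kill(pid, signal.SIGHUP)
        except ProcessLookupError:
            break  # already gone before this HUP could be sent
        time.sleep(settle)  # give the daemon time to reload or die
        if not is_running(pid):
            break
        survived += 1
    return survived
```

A result of 1 with attempts=2 would correspond to the init.d behaviour
described above (survives the first HUP, dies on the second), and 0 to the
'radiusd -X' behaviour.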



John.

-- 
---
John Horne, University of Plymouth, UK  Tel: +44 (0)1752 233914
E-mail: [EMAIL PROTECTED]   Fax: +44 (0)1752 233839


HUP stops radiusd

2007-05-14 Thread John Horne
Hello,

This is a 'me too' message I'm afraid. From the list archives I saw:

==
 Date: Mon, 02 Apr 2007 20:20:47 +0200
 From: Alan DeKok aland at deployingradius.com
 Subject: Re: HUP in freeradius-1.1.5 + CVS results in process death.
 To: FreeRadius users mailing list
   freeradius-users at lists.freeradius.org
 Message-ID: 4611497F.3000500 at deployingradius.com
 Content-Type: text/plain; charset=ISO-8859-1

 Arran Cudbard-Bell wrote:
   
 I know there's a bug report for this already,
 but when I HUP the process freeradius doesn't die in the same place.
 

   If it's an issue due to incorrectly free'd memory, the crashes will be
 random.

   There may be a fix in 1.1.6, but I'm not sure.

   Alan DeKok.
 --
   http://deployingradius.com   - The web site of the book
   http://deployingradius.com/blog/ - The blog
   
Looks more like a bug in rlm_tls. Dies every time on HUP, definitely
not random...
==


In our case, using freeradius 1.1.6, if I HUP the radiusd process it
crashes/stops. Running 'radiusd -X', the tail part shows:

=
 security: status_server = no
 main: debug_level = 0
read_config_files:  reading dictionary
read_config_files:  reading naslist
Using deprecated naslist file.  Support for this will go away soon.
read_config_files:  reading clients
read_config_files:  reading realms
Mon May 14 13:38:54 2007 : Info: rlm_exec: Wait=yes but no output
defined. Did you mean output=none?
Mon May 14 13:38:54 2007 : Error: radiusd.conf[230] Auth-Type PAP
already configured - skipping
Mon May 14 13:38:54 2007 : Error: radiusd.conf[234] Auth-Type MS-CHAP
already configured - skipping
Mon May 14 13:38:54 2007 : Info: rlm_eap_tls: Loading the certificate
file as a chain
Mon May 14 13:38:54 2007 : Error: rlm_eap: SSL error error:0906D06C:PEM
routines:PEM_read_bio:no start line
Mon May 14 13:38:54 2007 : Error: rlm_eap_tls: Error reading certificate
file
Mon May 14 13:38:54 2007 : Error: rlm_eap: Failed to initialize type tls
Mon May 14 13:38:54 2007 : Error: radiusd.conf[1]: eap: Module
instantiation failed.
Mon May 14 13:38:54 2007 : Error: radiusd.conf[238] Unknown module
eap.
Mon May 14 13:38:54 2007 : Error: radiusd.conf[229] Failed to parse
authenticate section.
=

This was running radiusd as the root user; running it as our usual
non-root user produced the same output. Starting up radiusd normally shows
no such error messages, so I'm not sure why it should now complain about
the Auth-Types or the certificate. Using the original radiusd.conf
produces the same error messages, with a couple of extras (for the
Auth-Types system and CHAP).

Any ideas?


Thanks,

John.

-- 
---
John Horne, University of Plymouth, UK  Tel: +44 (0)1752 233914
E-mail: [EMAIL PROTECTED]   Fax: +44 (0)1752 233839


Re: HUP stops radiusd

2007-05-14 Thread John Horne
On Mon, 2007-05-14 at 15:22 +0200, inverse wrote:
  In our case, using freeradius 1.1.6, if I HUP the radiusd process it
  crashes/stops. Running 'radiusd -X', the tail part shows:
 
  Mon May 14 13:38:54 2007 : Error: rlm_eap_tls: Error reading certificate
  file
 
 on HUP the radiusd process probably tries to switch to a non-root
 user. That might be the source of your message.
 
No, I deliberately configured the server to run as root to check that.
The same errors occurred. Secondly, if that was the case then the server
shouldn't really start at all, but it does. The problem only occurs (as
far as I am aware) after a HUP.


John.

-- 
---
John Horne, University of Plymouth, UK  Tel: +44 (0)1752 233914
E-mail: [EMAIL PROTECTED]   Fax: +44 (0)1752 233839


Re: HUP stops radiusd

2007-05-14 Thread Alan DeKok
John Horne wrote:
...
 In our case, using freeradius 1.1.6, if I HUP the radiusd process it
 crashes/stops. Running 'radiusd -X', the tail part shows:
...
 Mon May 14 13:38:54 2007 : Error: radiusd.conf[230] Auth-Type PAP
 already configured - skipping
 Mon May 14 13:38:54 2007 : Error: radiusd.conf[234] Auth-Type MS-CHAP
 already configured - skipping

  Those errors can be suppressed.  It's probably worth doing.

 Mon May 14 13:38:54 2007 : Info: rlm_eap_tls: Loading the certificate
 file as a chain
 Mon May 14 13:38:54 2007 : Error: rlm_eap: SSL error error:0906D06C:PEM
 routines:PEM_read_bio:no start line

  Ah I think what's happening is that OpenSSL is caching the file
from the last time it was read.  So the server starts, and reads 1
certificate from the file.  OpenSSL leaves the file open, or remembers
where it left off.  When FreeRADIUS asks OpenSSL to read the file again,
OpenSSL continues from where it left off, rather than starting from the
beginning of the file.

  That's not nice.  And it's not documented as doing that.  But I
suspect it would work.

  A simple test would be to do the following:

1) put two copies of the certificate into the file, one after the other.
2) start the server and verify it works
3) HUP the server, and verify that it correctly loads the certificate
4) HUP the server again, and see that it complains.
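
  Step 1 of the test above can be scripted. This is a minimal sketch,
assuming the file holds only the certificate (no private key) in PEM
format; the function names and any paths passed to them are hypothetical.

```python
from pathlib import Path

BEGIN_MARKER = "-----BEGIN CERTIFICATE-----"


def count_certificates(path):
    """Count PEM certificate start lines in the file at `path`."""
    return Path(path).read_text().count(BEGIN_MARKER)


def write_doubled_certificate(source, destination):
    """Write two back-to-back copies of `source` into `destination`.

    Returns the number of certificates found in the new file.
    """
    text = Path(source).read_text()
    if not text.endswith("\n"):
        text += "\n"  # keep the second BEGIN line on a line of its own
    Path(destination).write_text(text + text)
    return count_certificates(destination)
```

  Checking that the return value is 2 before starting the server confirms
that the test file really contains both copies.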

  If you do that, and it works like I outlined, then I would argue that
it's a bug in OpenSSL.  FreeRADIUS calls SSL_new() to initialize
OpenSSL, and SSL_free() to clean up after itself.  But I'd bet that
OpenSSL does *not* return to its initial state after calling
SSL_free().  Instead, it keeps some things cached...

  Knowing the explanation is nice, but I'm not sure how this lets us fix
the problem.

  Alan DeKok.
--
  http://deployingradius.com   - The web site of the book
  http://deployingradius.com/blog/ - The blog