Re: Unexpected "Exiting normally" 2.1.8?

2009-11-26 Thread Craig Campbell
0x1d988920
   tailfrom = (VALUE_PAIR *) 0x0
   found = (VALUE_PAIR *) 0x2
#6  0x2b5c632b993f in file_common (inst=0x1d90bb20, request=0x1d9732f0, filename=0x2b5c632b9e8e "acct_users", ht=0x1d90bdb0, request_pairs=0x1d985fe0, reply_pairs=0x1d9734a8) at rlm_files.c:472
   pl = (const PAIR_LIST *) 0x1d90bd70
   name = 0x1d986158 "valboulev...@advanced"
   match = 0x2b5c632b9e5e "DEFAULT"
   config_pairs = (VALUE_PAIR **) 0x1d973318
   check_tmp = (VALUE_PAIR *) 0x1d9886a0
   reply_tmp = (VALUE_PAIR *) 0x1d9887e0
   user_pl = (const PAIR_LIST *) 0x0
   default_pl = (const PAIR_LIST *) 0x0
   found = 1
   my_pl = {name = 0x2b5c632b9e5e "DEFAULT", check = 0x0, reply = 0xe0, lineno = 242428, order = 0, next = 0x0, lastdefault = 0x4433e120}
   buffer = '\0' , "valboulevard\000advanced", '\0' 

#7  0x2b5c632b9a66 in file_preacct (instance=0x1d90bb20, request=0x1d9732f0) at rlm_files.c:525
   inst = (struct file_instance *) 0x1d90bb20
#8  0x00420443 in call_modsingle (component=2, sp=0x1d94c8d0, request=0x1d9732f0) at modcall.c:297
   myresult = 0
#9  0x0042126b in modcall (component=2, c=0x1d94bcf0, request=0x1d9732f0) at modcall.c:669
   myresult = 7
   stack = {pointer = 1, priority = {0, 2, 0 , 
4, -805247808, 58, 0, 0, 0, 0, 494780912, 0, 1144252848, 0,
   0, 0, 0, 0, 1144252848}, result = {7, 2, 0, 1144252688, 0, 1619086899, 
11100, 1144252856, 0, 494819584, 0, 1144252848, 0,
   1619068620, -1672236017, 494849120, 0, 494773456, 0, 494819584, 0, 
1144252848, 0, 1144252752, 0, 1619088694, 11100,
   1144252784, 0, 494819584, 0, 1144252848}, children = {0x1d94bcf0, 
0x1d94c8d0, 0x9c53b40f0a39, 0x4433e990, 0x2b5c608155e5,
   0x4433e9a0, 0x1d7dc1f0, 0x4433e9b0, 0x1d7da4d0, 0x4433ea00, 0x1d7e5900, 
0x4433e9d0, 0xfe, 0x1d7dc1f8, 0x1d988458,
   0x50031, 0x1d988468, 0x1d7dc208, 0xee, 0x4433ea60, 0x2b5c6081e050, 
0xa0, 0x2b5c60823dd9, 0x1, 0x1d988420,
   0x1d9732df, 0x4, 0x31, 0x1d8e6180, 0x0, 0x1d973110, 0x1d988420}, 
start = {0x310005, 0x1d94bc60, 0x1d7dc1f0,
   0x1d7dbff8, 0x300ac31e, 0x14, 0x1d9731a0, 0x1d9731b4, 
0x1d9731b4, 0x4433eac0, 0x2b5c6082009a, 0x14, 0x1d9731a0, 0x0,
   0x300ac31e, 0x1d97314c, 0x1d988420, 0x4433ec30, 0x2b5c6081f143, 
0x4433eb70, 0x2b5c6081cce1, 0x1d8e6180, 0x0, 0x11d8e6180,

   0x10001, 0x1, 0x1d8e6180, 0x0, 0x1d973110, 0x0, 0x0, 0x0}}
   parent = (modcallable *) 0x1d94bcf0
   child = (modcallable *) 0x1d94c8d0
   sp = (modsingle *) 0x1d94c8d0
   if_taken = 0
   was_if = 0
#10 0x0041ea4f in indexed_modcall (comp=2, idx=0, request=0x1d9732f0) at modules.c:691
   rcode = 0
   list = (modcallable *) 0x1d94bcf0
   server = (virtual_server_t *) 0x1d94a550
#11 0x0041fdb6 in module_preacct (request=0x1d9732f0) at modules.c:1470
No locals.
#12 0x0040813c in rad_accounting (request=0x1d9732f0) at acct.c:57
   vp = (VALUE_PAIR *) 0x1d9732f0
   acct_type = 0
   result = 2
#13 0x004356b5 in radius_handle_request (request=0x1d9732f0, fun=0x408108 ) at event.c:4086
No locals.
#14 0x00426bd6 in request_handler_thread (arg=0x1d966a50) at threads.c:492
   fun = (RAD_REQUEST_FUNP) 0x408108 
   self = (THREAD_HANDLE *) 0x1d966a50
#15 0x003ad0006367 in start_thread () from /lib64/libpthread.so.0
No symbol table info available.
#16 0x003acf4d30ad in clone () from /lib64/libc.so.6
No symbol table info available.

Thread 4 (Thread 0x4393e940 (LWP 23504)):
#0  0x003acf4dee6e in __lll_lock_wait_private () from /lib64/libc.so.6
No symbol table info available.
#1  0x003acf476668 in _L_lock_12629 () from /lib64/libc.so.6
No symbol table info available.
#2  0x003acf47477f in malloc_atfork () from /lib64/libc.so.6
No symbol table info available.
#3  0x003ad2cdab52 in CRYPTO_malloc () from /lib64/libcrypto.so.6
No symbol table info available.
#4  0x003ad2c7fdbc in ?? () from /lib64/libcrypto.so.6
No symbol table info available.
#5  0x003ad2cd82bd in ERR_clear_error () from /lib64/libcrypto.so.6
No symbol table info available.
#6  0x00426b4f in request_handler_thread (arg=0x1d9668d0) at threads.c:474
   fun = (RAD_REQUEST_FUNP) 0
   self = (THREAD_HANDLE *) 0x1d9668d0
#7  0x003ad0006367 in start_thread () from /lib64/libpthread.so.0
No symbol table info available.
#8  0x003acf4d30ad in clone () from /lib64/libc.so.6
No symbol table info available.

Thread 3 (Thread 0x42f3d940 (LWP 23503)):
#0  0x003acf4dee6e in __lll_lock_wait_private () from /lib64/libc.so.6
No symbol table info available.
#1  0x003acf476668 in _L_lock_12629 () from /lib64/libc.so.6
No symbol table info available.
#2  0x003acf47477f in malloc_atfork () from /lib64/libc.so.6
No symbol table info available.
#3  0x003ad2cdab52 in CRYPTO_malloc () from /lib64/libcrypto.so.6
No symbol table info available.
#4  0x003ad2c7fdbc in ?? () from /lib64/libcrypto.so.6
No symbol table info available.
#5  0x003ad2cd82bd in ERR_clear_error () from /lib64/libcrypto.so.6
No symbol table info available.
#6  0x00426b4f in request_handler_thread (arg=0x1d966750) at threads.c:474
   fun = (RAD_REQUEST_FUNP) 0



- Original Message - 
From: "Alan DeKok" 

To: "FreeRadius users mailing list" 
Sent: Thursday, November 26, 2009 2:19 PM
Subject: Re: Unexpected "Exiting normally" 2.1.8?



Bjørn Mork wrote:

However, I think I found one other possibility.  This code in
fr_event_loop() will exit if the select() fails:

    rcode = select(el->maxfd + 1, &read_fds, NULL, NULL, wake);
    if ((rcode < 0) && (errno != EINTR)) {
        el->dispatch = 0;
        return 0;
    }


Might this happen due to a dead home server fd in the &read_fds?


 It shouldn't.  The only way select() should fail is if one of the file
descriptors has been closed without updating the read_fds array.  And
that shouldn't happen, either.

 What error is select() returning?

 Alan DeKok.
-
List info/subscribe/unsubscribe? See 
http://www.freeradius.org/list/users.html 




Re: Unexpected "Exiting normally" 2.1.8?

2009-11-26 Thread Alan DeKok
Bjørn Mork wrote:
> However, I think I found one other possibility.  This code in
> fr_event_loop() will exit if the select() fails:
> 
>     rcode = select(el->maxfd + 1, &read_fds, NULL, NULL, wake);
>     if ((rcode < 0) && (errno != EINTR)) {
>         el->dispatch = 0;
>         return 0;
>     }
> 
> 
> Might this happen due to a dead home server fd in the &read_fds?

  It shouldn't.  The only way select() should fail is if one of the file
descriptors has been closed without updating the read_fds array.  And
that shouldn't happen, either.

  What error is select() returning?

  Alan DeKok.
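
If select() does turn out to be failing because a descriptor was closed
without being removed from the set, one way to find the culprit is to probe
each tracked descriptor with fcntl(fd, F_GETFD).  The sketch below is only an
illustration under assumptions: fd_is_stale() and report_stale_fds() are
hypothetical helpers, not FreeRADIUS functions, and the descriptors the event
loop believes are open are assumed to be available as a plain array.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: return 1 if fd is not an open descriptor, else 0. */
int fd_is_stale(int fd)
{
    return (fcntl(fd, F_GETFD) < 0) && (errno == EBADF);
}

/*
 * Hypothetical helper: walk the descriptors the event loop believes are
 * open and report each one that has been closed behind its back.
 * Returns the number of stale descriptors found.
 */
size_t report_stale_fds(const int *fds, size_t num_fds)
{
    size_t i, stale = 0;

    assert(fds != NULL || num_fds == 0);

    for (i = 0; i < num_fds; i++) {
        if (fd_is_stale(fds[i])) {
            fprintf(stderr, "fd %d is closed but still tracked\n", fds[i]);
            stale++;
        }
    }
    return stale;
}
```

Calling report_stale_fds() right after a failed select(), before the loop
returns, would show whether a dead home server socket is still in the set.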

Re: Unexpected "Exiting normally" 2.1.8?

2009-11-26 Thread Bjørn Mork
Alan DeKok  writes:

> Bjørn Mork wrote:
>> Yes.  Just to be sure, I've repeated the process and the trace is the
>> same:  Useless:
> ...
>> Which I guess tells us that there is some other path here than through
>> fr_event_loop_exit and radius_signal_self with flag==2?
>
>   For the life of me, I can't see another path through the code.
>
>   Are you sending it periodic HUPs?

No

>   The only other possibility is some memory over-write.  You should use
> gdb...
>
> $ gdb --args radiusd -f
> (gdb) break fr_event_loop
> (gdb) run
> (gdb) watch el->exit
> (gdb) del 1
> (gdb) cont
> ...
> (gdb) bt
>
>   That should cause it to stop as soon as *anything* stomps on the event
> loop flag that says "stop the event loop"

Will do.

However, I think I found one other possibility.  This code in
fr_event_loop() will exit if the select() fails:

    rcode = select(el->maxfd + 1, &read_fds, NULL, NULL, wake);
    if ((rcode < 0) && (errno != EINTR)) {
        el->dispatch = 0;
        return 0;
    }


Might this happen due to a dead home server fd in the &read_fds?



Bjørn
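
The failure mode asked about above can be reproduced in isolation.  This is a
minimal sketch, assuming a POSIX system: select_errno_with_closed_fd() is a
made-up name, and the pipe merely stands in for a home server socket.  It
leaves a closed descriptor in the fd_set, calls select(), and logs errno
before giving up, so the reason for leaving the loop is not lost.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/select.h>
#include <unistd.h>

/*
 * Hypothetical demo: put a descriptor into an fd_set, close it without
 * updating the set, and return the errno select() reports.
 * Returns 0 if select() unexpectedly succeeds, -1 if setup fails.
 */
int select_errno_with_closed_fd(void)
{
    int pipe_fds[2];
    fd_set read_fds;
    struct timeval wake = { 0, 0 };    /* poll, do not block */
    int rcode, saved;

    if (pipe(pipe_fds) < 0) return -1;
    assert(pipe_fds[0] < FD_SETSIZE);  /* FD_SET() requires this */

    FD_ZERO(&read_fds);
    FD_SET(pipe_fds[0], &read_fds);

    /* Close the descriptor but "forget" to remove it from read_fds. */
    close(pipe_fds[0]);

    rcode = select(pipe_fds[0] + 1, &read_fds, NULL, NULL, &wake);
    saved = (rcode < 0) ? errno : 0;

    if ((rcode < 0) && (saved != EINTR)) {
        /* Log before bailing out of the loop. */
        fprintf(stderr, "select() failed: %s\n", strerror(saved));
    }

    close(pipe_fds[1]);
    return saved;
}
```

POSIX specifies EBADF for this case, so a "Bad file descriptor" message in
the log would support the dead-fd theory.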


Re: Unexpected "Exiting normally" 2.1.8?

2009-11-26 Thread Alan DeKok
Bjørn Mork wrote:
> Yes.  Just to be sure, I've repeated the process and the trace is the
> same:  Useless:
...
> Which I guess tells us that there is some other path here than through
> fr_event_loop_exit and radius_signal_self with flag==2?

  For the life of me, I can't see another path through the code.

  Are you sending it periodic HUPs?

  The only other possibility is some memory over-write.  You should use
gdb...

$ gdb --args radiusd -f
(gdb) break fr_event_loop
(gdb) run
(gdb) watch el->exit
(gdb) del 1
(gdb) cont
...
(gdb) bt

  That should cause it to stop as soon as *anything* stomps on the event
loop flag that says "stop the event loop"

  Alan DeKok.
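
To make the "memory over-write" idea concrete, here is a contrived sketch of
how a copy with the wrong length can flip a neighbouring flag.  Everything in
it is an assumption for illustration: struct sketch_loop is a hypothetical
layout, not the real event list structure, and the write is deliberately kept
inside the struct so that the demo itself stays well defined.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical layout: a small buffer directly followed by the exit flag. */
struct sketch_loop {
    char name[8];
    int  exit;
};

/*
 * Fill the first len bytes of the struct, the way a buggy caller with a
 * wrong length would.  Any len above sizeof(name) spills into the bytes
 * that follow, which here include the exit flag.
 */
void sketch_fill(struct sketch_loop *el, unsigned char fill, size_t len)
{
    assert(len <= sizeof(*el));    /* stay inside the object */
    memset(el, fill, len);
}

/* Return 1 if the loop would stop on its next pass. */
int sketch_should_exit(const struct sketch_loop *el)
{
    return el->exit != 0;
}
```

A hardware watchpoint on the flag, as in the gdb session above, stops at the
memset() the moment the flag's bytes change, which is what makes it useful
for this kind of bug.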

Re: Unexpected "Exiting normally" 2.1.8?

2009-11-26 Thread Bjørn Mork
Alan DeKok  writes:
> Bjørn Mork wrote:
>> I don't have that symbol.  Did you mean fr_event_loop_exit?
>
>   Sure.
>
>> Anyway, I ran the server (this time a lab-/test-server with some traffic
>> but nothing near any real load) using
>> 
>> Breakpoint 1, radius_signal_self (flag=2) at event.c:3733
>> 3733    event.c: No such file or directory.
>> in event.c
>> (gdb) cont
>
>   Arg... PLEASE give the stack trace for this!  "bt", or "thread apply
> all bt full".
>
>   Simply continuing means that you've ignored the break point.

Sorry, that was a mere cut'n paste error from an initial test (not the
actual bug, just tested that I'd actually break on SIGTERM).

I apologise for the confusion this caused.

>> And I got:
>
>   The same stack trace as before, LONG after the useful information has
> been lost.

Yes.  Just to be sure, I've repeated the process and the trace is the
same:  Useless:

(gdb) bt full
#0  0x00390d8306f7 in kill () from /lib64/libc.so.6
No symbol table info available.
#1  0x00423b6a in main (argc=4, argv=0x7fff3ada9b18) at radiusd.c:419
rcode = 0
argval = -1
spawn_flag = 1
dont_fork = 1
flag = 0
act = {__sigaction_handler = {sa_handler = 0x423d41 , sa_sigaction = 0x423d41 }, sa_mask = {__val = {0 }}, sa_flags = 0, sa_restorer = 0}


Which I guess tells us that there is some other path here than through
fr_event_loop_exit and radius_signal_self with flag==2?

What that could possibly be is beyond my imagination, I must admit...



Bjørn


Re: Unexpected "Exiting normally" 2.1.8?

2009-11-26 Thread Alan DeKok
Craig Campbell wrote:
> With the info you need (I hope)...

> Breakpoint 2, radius_signal_self (flag=8) at event.c:3733
> 3733    rcode = read(self_pipe[0], buffer, sizeof(buffer));

  Hmm... that's with 'flag == 8', not 'flag == 2'.  Why is it stopping?

  It's *supposed* to call that function a lot when the detail file is
being read.  But the condition when "flag ==2" is the ONLY interesting one.

  Alan DeKok.
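
For readers unfamiliar with the pattern behind radius_signal_self(), the
following is a rough sketch of a self-pipe with flag bytes.  It is not the
FreeRADIUS implementation: the sketch_* names are hypothetical, the only
value taken from this thread is that flag 2 means "exit", and 8 is used
simply as "some other flag".

```c
#include <assert.h>
#include <stdio.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical flag values; the thread only says that 2 means "exit". */
#define SKETCH_FLAG_EXIT  2
#define SKETCH_FLAG_OTHER 8

static int sketch_pipe[2] = { -1, -1 };

/* Create the pipe.  Returns 0 on success, -1 on failure. */
int sketch_init(void)
{
    return pipe(sketch_pipe);
}

/* Any thread (or a signal handler) queues one flag byte for the main loop. */
int sketch_signal_self(int flag)
{
    unsigned char byte = (unsigned char) flag;

    return (write(sketch_pipe[1], &byte, 1) == 1) ? 0 : -1;
}

/*
 * One pass of the main loop: wait up to one second for queued flags and
 * OR them together.  Returns 1 if an exit was requested, 0 if not,
 * -1 on error.
 */
int sketch_loop_once(void)
{
    unsigned char buffer[16];
    struct timeval wake = { 1, 0 };
    fd_set read_fds;
    ssize_t i, len;
    int flags = 0;

    assert(sketch_pipe[0] >= 0 && sketch_pipe[0] < FD_SETSIZE);

    FD_ZERO(&read_fds);
    FD_SET(sketch_pipe[0], &read_fds);

    if (select(sketch_pipe[0] + 1, &read_fds, NULL, NULL, &wake) < 0) return -1;
    if (!FD_ISSET(sketch_pipe[0], &read_fds)) return 0;    /* timed out */

    len = read(sketch_pipe[0], buffer, sizeof(buffer));
    if (len < 0) return -1;

    for (i = 0; i < len; i++) flags |= buffer[i];

    if (flags & SKETCH_FLAG_EXIT) {
        printf("Exiting normally.\n");
        return 1;
    }
    return 0;
}
```

In this sketch a flag of 8 wakes the loop without stopping it, while a flag
of 2 is the only one that ends it, which is why a breakpoint that fires on
flag=8 says nothing about the exit path.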


Re: Unexpected "Exiting normally" 2.1.8?

2009-11-26 Thread Craig Campbell
0426abb in request_handler_thread (arg=0x11533a50) at threads.c:453
   fun = (RAD_REQUEST_FUNP) 0x408108 
   self = (THREAD_HANDLE *) 0x11533a50
#2  0x003ad0006367 in start_thread () from /lib64/libpthread.so.0
No symbol table info available.
#3  0x003acf4d30ad in clone () from /lib64/libc.so.6
No symbol table info available.

Thread 4 (Thread 0x4246f940 (LWP 5880)):
#0  0x003ad000c6b1 in sem_wait () from /lib64/libpthread.so.0
No symbol table info available.
#1  0x00426abb in request_handler_thread (arg=0x115338d0) at threads.c:453
   fun = (RAD_REQUEST_FUNP) 0x408108 
   self = (THREAD_HANDLE *) 0x115338d0
#2  0x003ad0006367 in start_thread () from /lib64/libpthread.so.0
No symbol table info available.
#3  0x003acf4d30ad in clone () from /lib64/libc.so.6
No symbol table info available.

Thread 3 (Thread 0x41a6e940 (LWP 5879)):
#0  0x003ad000c6b1 in sem_wait () from /lib64/libpthread.so.0
No symbol table info available.
#1  0x00426abb in request_handler_thread (arg=0x11533750) at threads.c:453
   fun = (RAD_REQUEST_FUNP) 0x408108 
   self = (THREAD_HANDLE *) 0x11533750
#2  0x003ad0006367 in start_thread () from /lib64/libpthread.so.0
No symbol table info available.
#3  0x003acf4d30ad in clone () from /lib64/libc.so.6
No symbol table info available.

Thread 2 (Thread 0x4106d940 (LWP 5878)):
#0  0x003ad000c6b1 in sem_wait () from /lib64/libpthread.so.0
No symbol table info available.
#1  0x00426abb in request_handler_thread (arg=0x11532620) at threads.c:453
   fun = (RAD_REQUEST_FUNP) 0x408108 
   self = (THREAD_HANDLE *) 0x11532620
#2  0x003ad0006367 in start_thread () from /lib64/libpthread.so.0
No symbol table info available.
#3  0x003acf4d30ad in clone () from /lib64/libc.so.6
No symbol table info available.

Thread 1 (Thread 0x2b9e812b4e10 (LWP 5870)):
#0  0x003acf4cc5f2 in select () from /lib64/libc.so.6
No symbol table info available.
#1  0x2b9e80e7fa4c in fr_event_loop (el=0x1151b7f0) at event.c:379
   i = 5
   rcode = 0
   when = {tv_sec = 0, tv_usec = 617790}
   wake = (struct timeval *) 0x7fff29c4b000
   read_fds = {fds_bits = {122944, 0 }}
#2  0x004355b3 in radius_event_process () at event.c:4072
No locals.
#3  0x00424124 in main (argc=2, argv=0x7fff29c4b248) at radiusd.c:398
   rcode = 58
   argval = -1
   spawn_flag = 1
   dont_fork = 1
   flag = 0
   act = {__sigaction_handler = {sa_handler = 0x424349 , sa_sigaction = 0x424349 }, sa_mask = {__val = {0 }}, sa_flags = 0, sa_restorer = 0}
3733    rcode = read(self_pipe[0], buffer, sizeof(buffer));
(gdb)

I haven't quit gdb yet, so let me know if you need more...

Thanks,
-craig

- Original Message - 
From: "Alan DeKok" 

To: "FreeRadius users mailing list" 
Sent: Thursday, November 26, 2009 7:36 AM
Subject: Re: Unexpected "Exiting normally" 2.1.8?



Bjørn Mork wrote:

I don't have that symbol.  Did you mean fr_event_loop_exit?


 Sure.


Anyway, I ran the server (this time a lab-/test-server with some traffic
but nothing near any real load) using

Breakpoint 1, radius_signal_self (flag=2) at event.c:3733
3733    event.c: No such file or directory.
in event.c
(gdb) cont


 Arg... PLEASE give the stack trace for this!  "bt", or "thread apply
all bt full".

 Simply continuing means that you've ignored the break point.


And I got:


 The same stack trace as before, LONG after the useful information has
been lost.

 Alan DeKok.


Re: Unexpected "Exiting normally" 2.1.8?

2009-11-26 Thread Craig Campbell

Here are the results from the latest gdb,

[[r...@radius-a ~]# gdb radiusd
GNU gdb Fedora (6.8-27.el5)
Copyright (C) 2008 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu"...
(gdb) break event_loop_exit
Function "event_loop_exit" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 1 (event_loop_exit) pending.
(gdb) break radius_signal_self
Breakpoint 2 at 0x434d9f: file event.c, line 3733.
(gdb) cond 1 (flag == 2)
(gdb) run -f
Starting program: /usr/local/sbin/radiusd -f
Error in re-setting breakpoint 1: Function "event_loop_exit" not defined.
[Thread debugging using libthread_db enabled]
[New Thread 0x2b8a5990de10 (LWP 543)]
[New Thread 0x41850940 (LWP 551)]
[New Thread 0x42400940 (LWP 552)]
[New Thread 0x42e01940 (LWP 553)]
[New Thread 0x43802940 (LWP 554)]
[New Thread 0x44203940 (LWP 555)]
Detaching after fork from child process 556.
Detaching after fork from child process 557.
Detaching after fork from child process 616.



Detaching after fork from child process 5364.
Detaching after fork from child process 5394.
[Switching to Thread 0x45605940 (LWP 4185)]

Breakpoint 2, radius_signal_self (flag=8) at event.c:3733
3733    rcode = read(self_pipe[0], buffer, sizeof(buffer));
(gdb)

Thanks,
-craig

- Original Message - 
From: "Alan DeKok" 

To: "FreeRadius users mailing list" 
Sent: Thursday, November 26, 2009 1:45 AM
Subject: Re: Unexpected "Exiting normally" 2.1.8?



Bjørn Mork wrote:

I am now seeing this very same problem, and strongly suspect it to be
related to dead proxy home servers.  I was able to provoke the "Exiting
normally" on a server with *no* traffic at all, by doing a couple of
requests for a realm with dead home servers and then waiting:

 Wed Nov 25 18:03:56 2009 : Error: PROXY: Marking home server 88.a.b.158 port 1812 as zombie (it looks like it is dead).
 Wed Nov 25 18:04:35 2009 : Error: PROXY: Marking home server 84.c.d.222 port 1812 as zombie (it looks like it is dead).
 Wed Nov 25 19:38:13 2009 : Info: Exiting normally.

No requests at all were sent to this server between the two last log
lines.


 Hmm... the "exiting normally" means that it received a signal to exit
(internal or external).  Otherwise, it just keeps running.

 Try using gdb, and:

(gdb) break event_loop_exit
(gdb) break radius_signal_self
(gdb) cond 1 (flag == 2)

(gdb) run

 And then when it stops:

(gdb) thread apply all bt full

 That *should* catch the stack trace where it exits.


I was planning to use the 2.1.7 release, but hit the recursive mutex
problem.


 Ugh.  Some systems don't support recursive mutexes, and even better,
don't complain when you try to use them!


 Now, adding the two facts, I'm starting to wonder whether the
"Exiting normally" bug might be related to the fix for the recursive
mutexes?  They are both related to dead home servers.  Makes me
suspicious...


 Quite possibly, yes.  But the fact that it exits a minute and a half
after the last packet is odd.


And I'm wondering what my other options are wrt the mutex problem.  I am
pretty much stuck with RHEL on these servers (not my choice).  Is this a
glibc 2.5 problem?  Should I demand an upgrade to a more modern OS?


 Let's wait for the back trace.

 Alan DeKok.
-
List info/subscribe/unsubscribe? See 
http://www.freeradius.org/list/users.html 



__ Information from ESET Smart Security, version of virus signature 
database 4636 (20091125) __

The message was checked by ESET Smart Security.

http://www.eset.com



-
List info/subscribe/unsubscribe? See http://www.freeradius.org/list/users.html

Re: Unexpected "Exiting normally" 2.1.8?

2009-11-26 Thread Alan DeKok
Bjørn Mork wrote:
> I don't have that symbol.  Did you mean fr_event_loop_exit?

  Sure.

> Anyway, I ran the server (this time a lab-/test-server with some traffic
> but nothing near any real load) using
> 
> Breakpoint 1, radius_signal_self (flag=2) at event.c:3733
> 3733    event.c: No such file or directory.
> in event.c
> (gdb) cont

  Arg... PLEASE give the stack trace for this!  "bt", or "thread apply
all bt full".

  Simply continuing means that you've ignored the break point.

> And I got:

  The same stack trace as before, LONG after the useful information has
been lost.

  Alan DeKok.


Re: Unexpected "Exiting normally" 2.1.8?

2009-11-26 Thread Bjørn Mork
Alan DeKok  writes:
> Bjørn Mork wrote:
>> I am now seeing this very same problem, and strongly suspect it to be
>> related to dead proxy home servers.  I was able to provoke the "Exiting
>> normally" on a server with *no* traffic at all, by doing a couple of
>> requests for a realm with dead home servers and then waiting:
>> 
>>  Wed Nov 25 18:03:56 2009 : Error: PROXY: Marking home server 88.a.b.158 port 1812 as zombie (it looks like it is dead).
>>  Wed Nov 25 18:04:35 2009 : Error: PROXY: Marking home server 84.c.d.222 port 1812 as zombie (it looks like it is dead).
>>  Wed Nov 25 19:38:13 2009 : Info: Exiting normally.
>> 
>> No requests at all were sent to this server between the two last log
>> lines.
>
>   Hmm... the "exiting normally" means that it received a signal to exit
> (internal or external).  Otherwise, it just keeps running.
>
>   Try using gdb, and:
>
> (gdb) break event_loop_exit

I don't have that symbol.  Did you mean fr_event_loop_exit?
Anyway, I ran the server (this time a lab-/test-server with some traffic
but nothing near any real load) using

Breakpoint 1, radius_signal_self (flag=2) at event.c:3733
3733    event.c: No such file or directory.
in event.c
(gdb) cont
Continuing.

Breakpoint 2, fr_event_loop_exit (el=0x8e3c4d0, code=2) at event.c:309
309 event.c: No such file or directory.
in event.c



This is still based on the stable tree with commit
2df19cf0024fd23d2906c13c0b01067076540871 as the last one.


And I got:



Program received signal SIGTERM, Terminated.
0x00390d8306f7 in kill () from /lib64/libc.so.6
(gdb)  thread apply all bt full

Thread 22 (Thread 0x4d189940 (LWP 23437)):
#0  0x00390e40c9b1 in sem_wait () from /lib64/libpthread.so.0
No symbol table info available.
#1  0x0042632e in request_handler_thread (arg=0x1035f830) at threads.c:453
fun = (RAD_REQUEST_FUNP) 0x4080d8 
self = (THREAD_HANDLE *) 0x1035f830
#2  0x00390e4064a7 in start_thread () from /lib64/libpthread.so.0
No symbol table info available.
#3  0x00390d8d3c2d in clone () from /lib64/libc.so.6
No symbol table info available.

Thread 20 (Thread 0x4bd87940 (LWP 23435)):
#0  0x00390e40c9b1 in sem_wait () from /lib64/libpthread.so.0
No symbol table info available.
#1  0x0042632e in request_handler_thread (arg=0x1035f530) at threads.c:453
fun = (RAD_REQUEST_FUNP) 0x4094c7 
self = (THREAD_HANDLE *) 0x1035f530
#2  0x00390e4064a7 in start_thread () from /lib64/libpthread.so.0
No symbol table info available.
#3  0x00390d8d3c2d in clone () from /lib64/libc.so.6
No symbol table info available.

Thread 17 (Thread 0x49f84940 (LWP 23432)):
#0  0x00390e40c9b1 in sem_wait () from /lib64/libpthread.so.0
No symbol table info available.
#1  0x0042632e in request_handler_thread (arg=0x1035f0b0) at threads.c:453
fun = (RAD_REQUEST_FUNP) 0x4094c7 
self = (THREAD_HANDLE *) 0x1035f0b0
#2  0x00390e4064a7 in start_thread () from /lib64/libpthread.so.0
No symbol table info available.
#3  0x00390d8d3c2d in clone () from /lib64/libc.so.6
No symbol table info available.

Thread 14 (Thread 0x48181940 (LWP 23429)):
#0  0x00390e40c9b1 in sem_wait () from /lib64/libpthread.so.0
No symbol table info available.
#1  0x0042632e in request_handler_thread (arg=0x1035ec30) at threads.c:453
fun = (RAD_REQUEST_FUNP) 0x4080d8 
self = (THREAD_HANDLE *) 0x1035ec30
#2  0x00390e4064a7 in start_thread () from /lib64/libpthread.so.0
No symbol table info available.
#3  0x00390d8d3c2d in clone () from /lib64/libc.so.6
No symbol table info available.

Thread 12 (Thread 0x46d7f940 (LWP 23427)):
#0  0x00390e40c9b1 in sem_wait () from /lib64/libpthread.so.0
No symbol table info available.
#1  0x0042632e in request_handler_thread (arg=0x1035e930) at threads.c:453
fun = (RAD_REQUEST_FUNP) 0x4080d8 
self = (THREAD_HANDLE *) 0x1035e930
#2  0x00390e4064a7 in start_thread () from /lib64/libpthread.so.0
No symbol table info available.
#3  0x00390d8d3c2d in clone () from /lib64/libc.so.6
No symbol table info available.

Thread 10 (Thread 0x4597d940 (LWP 23425)):
#0  0x00390e40c9b1 in sem_wait () from /lib64/libpthread.so.0
No symbol table info available.
#1  0x0042632e in request_handler_thread (arg=0x1035e630) at threads.c:453
fun = (RAD_REQUEST_FUNP) 0x4094c7 
self = (THREAD_HANDLE *) 0x1035e630
#2  0x00390e4064a7 in start_thread () from /lib64/libpthread.so.0
No symbol table info available.
#3  0x00390d8d3c2d in clone () from /lib64/libc.so.6
No symbol table info available.

Thread 3 (Thread 0x41d77940 (LWP 23418)):
#0  0x00390e40c9b1 in sem_wait () from /lib64/libpthread.so.0
No symbol table info available.
#1  0x0042632e in request_handler_thread (arg=0x1035dcf0) at threads.c:453
fun = (RAD_REQUEST_FUNP) 0x4094c7 
self = (THREAD_HANDLE *) 0x1035dcf0
#2  0x00390e4064a7 in start_

Re: Unexpected "Exiting normally" 2.1.8?

2009-11-25 Thread Bjørn Mork
Alan DeKok  writes:
> Bjørn Mork wrote:
>> I am now seeing this very same problem, and strongly suspect it to be
>> related to dead proxy home servers.  I was able to provoke the "Exiting
>> normally" on a server with *no* traffic at all, by doing a couple of
>> requests for a realm with dead home servers and then waiting:
>> 
>>  Wed Nov 25 18:03:56 2009 : Error: PROXY: Marking home server 88.a.b.158 port 1812 as zombie (it looks like it is dead).
>>  Wed Nov 25 18:04:35 2009 : Error: PROXY: Marking home server 84.c.d.222 port 1812 as zombie (it looks like it is dead).
>>  Wed Nov 25 19:38:13 2009 : Info: Exiting normally.
>> 
>> No requests at all were sent to this server between the two last log
>> lines.
>
>   Hmm... the "exiting normally" means that it received a signal to exit
> (internal or external).  Otherwise, it just keeps running.
>
>   Try using gdb, and:
>
> (gdb) break event_loop_exit
> (gdb) break radius_signal_self
> (gdb) cond 1 (flag == 2)
>
> (gdb) run
>
>   And then when it stops:
>
> (gdb) thread apply all bt full
>
>   That *should* catch the stack trace where it exits.

Will do.  Thanks

>> I was planning to use the 2.1.7 release, but hit the recursive mutex
>> problem.
>
>   Ugh.  Some systems don't support recursive mutexes, and even better,
> don't complain when you try to use them!
>
>>  Now, adding the two facts, I'm starting to wonder whether the
>> "Exiting normally" bug might be related to the fix for the recursive
>> mutexes?  They are both related to dead home servers.  Makes me
>> suspicious...
>
>   Quite possibly, yes.  But the fact that it exits a minute and a half
> after the last packet is odd.

Note that it's an hour and a half.  Which I guess is even more odd.

These are today's events for the server which is in production:

 server ~ 1004$ grep Exit log/radius.log
 Thu Nov 26 02:08:20 2009 : Info: Exiting normally.
 Thu Nov 26 04:16:52 2009 : Info: Exiting normally.
 Thu Nov 26 05:52:20 2009 : Info: Exiting normally.
 Thu Nov 26 07:40:19 2009 : Info: Exiting normally.


Notice the pattern.  There's 1.5 ~ 2 hours between each restart.  


Bjørn


Re: Unexpected "Exiting normally" 2.1.8?

2009-11-25 Thread Alan DeKok
Bjørn Mork wrote:
> I am now seeing this very same problem, and strongly suspect it to be
> related to dead proxy home servers.  I was able to provoke the "Exiting
> normally" on a server with *no* traffic at all, by doing a couple of
> requests for a realm with dead home servers and then waiting:
> 
>  Wed Nov 25 18:03:56 2009 : Error: PROXY: Marking home server 88.a.b.158 port 1812 as zombie (it looks like it is dead).
>  Wed Nov 25 18:04:35 2009 : Error: PROXY: Marking home server 84.c.d.222 port 1812 as zombie (it looks like it is dead).
>  Wed Nov 25 19:38:13 2009 : Info: Exiting normally.
> 
> No requests at all were sent to this server between the two last log
> lines.

  Hmm... the "exiting normally" means that it received a signal to exit
(internal or external).  Otherwise, it just keeps running.

  Try using gdb, and:

(gdb) break event_loop_exit
(gdb) break radius_signal_self
(gdb) cond 1 (flag == 2)

(gdb) run

  And then when it stops:

(gdb) thread apply all bt full

  That *should* catch the stack trace where it exits.

> I was planning to use the 2.1.7 release, but hit the recursive mutex
> problem.

  Ugh.  Some systems don't support recursive mutexes, and even better,
don't complain when you try to use them!

>  Now, adding the two facts, I'm starting to wonder whether the
> "Exiting normally" bug might be related to the fix for the recursive
> mutexes?  They are both related to dead home servers.  Makes me
> suspicious...

  Quite possibly, yes.  But the fact that it exits a minute and a half
after the last packet is odd.

> And I'm wondering what my other options are wrt the mutex problem.  I am
> pretty much stuck with RHEL on these servers (not my choice).  Is this a
> glibc 2.5 problem?  Should I demand an upgrade to a more modern OS?

  Let's wait for the back trace.

  Alan DeKok.

Re: Unexpected "Exiting normally" 2.1.8?

2009-11-25 Thread Bjørn Mork
I am now seeing this very same problem, and strongly suspect it to be
related to dead proxy home servers.  I was able to provoke the "Exiting
normally" on a server with *no* traffic at all, by doing a couple of
requests for a realm with dead home servers and then waiting:

 Wed Nov 25 18:03:56 2009 : Error: PROXY: Marking home server 88.a.b.158 port 1812 as zombie (it looks like it is dead).
 Wed Nov 25 18:04:35 2009 : Error: PROXY: Marking home server 84.c.d.222 port 1812 as zombie (it looks like it is dead).
 Wed Nov 25 19:38:13 2009 : Info: Exiting normally.

No requests at all were sent to this server between the two last log
lines.  This server is running the latest stable git, i.e. up to
 commit 2df19cf0024fd23d2906c13c0b01067076540871

I was planning to use the 2.1.7 release, but hit the recursive mutex
problem.  Now, adding the two facts, I'm starting to wonder whether the
"Exiting normally" bug might be related to the fix for the recursive
mutexes?  They are both related to dead home servers.  Makes me
suspicious...

And I'm wondering what my other options are wrt the mutex problem.  I am
pretty much stuck with RHEL on these servers (not my choice).  Is this a
glibc 2.5 problem?  Should I demand an upgrade to a more modern OS?


Bjørn


Re: Unexpected "Exiting normally" 2.1.8?

2009-11-25 Thread Alan DeKok
Craig Campbell wrote:
> Ok,
>can anyone identify a certain "GOOD" build to use for the git bisect?
> (Say where 2.1.7 was released?)
> 
> I looked through the logs and have arbitrarily selected,
> 134f314c57d67b56bab93db4089c25e956ad6cf2] Lots of notes prior to 2.1.7
> 
> I do not know how to force git to build that revision so I could
> actually verify it is good.

  You could always try building it by hand.

  Also, when running it in gdb, try:

(gdb) break radius_signal_self
(gdb) cond 1 (flag == 2)

(gdb) run

  That should catch the case where it's been told to exit.  That should
be the *only* case where it exits the event loop.

  Alan DeKok.


Re: Unexpected "Exiting normally" 2.1.8?

2009-11-25 Thread Bjørn Mork
"Craig Campbell"  writes:

>can anyone identify a certain "GOOD" build to use for the git
> bisect? (Say where 2.1.7 was released?)
>
> I looked through the logs and have arbitrarily selected,
> 134f314c57d67b56bab93db4089c25e956ad6cf2] Lots of notes prior to 2.1.7
>
> I do not know how to force git to build that revision so I could
> actually verify it is good.

Not sure if I understand the question...

"git tag" will give you a list of tags.  "release_2_1_7" looks like a
good choice.  You could use "git log" or "git show" or something like
that to get the hash, but you really don't need to. If you know the
2.1.7 release was good, then you can just do

 "git bisect good release_2_1_7"




Bjørn


Re: Unexpected "Exiting normally" 2.1.8?

2009-11-25 Thread Craig Campbell

Ok,
   can anyone identify a certain "GOOD" build to use for the git bisect? 
(Say where 2.1.7 was released?)


I looked through the logs and have arbitrarily selected,
134f314c57d67b56bab93db4089c25e956ad6cf2] Lots of notes prior to 2.1.7

I do not know how to force git to build that revision so I could actually 
verify it is good.


Thanks,
-craig

-
List info/subscribe/unsubscribe? See http://www.freeradius.org/list/users.html


Re: Unexpected "Exiting normally" 2.1.8?

2009-11-24 Thread Craig Campbell

Thanks for the correction.

I have rebuilt and am re-running my test.  I just hope I didn't somehow 
taint the bisect work and provide misleading information to Alan.


I should know some time today if I need to redo the bisection.
For my previous work I had done,

$ git bisect start
$ git bisect bad
$ git bisect good 321c0ae58641f709d115526bb564cbd8c4dab71d
    <- I do not have full confidence in this

Followed by loops of,
$ ./conf
$ CFLAGS='-O0 -g' ./configure
$ make clean
$ find . -name "*.o"
    <- sometimes I found lingering .o files - not certain why.  I would
       delete any I discovered at this point
$ make
$ git bisect skip|bad|good
    <- depending on if build failed, binary crashed or other error (skip),
       had error (bad), or succeeded (good)
$ git pull
    <- I THINK this may be unnecessary..  but not certain.  Docs I found on
       git were not entirely clear


If I need to re-bisect, could you perhaps spoon feed me the commands to 
ensure I'm doing it correctly?  Specifically, how can I acquire and verify I 
have my first "good" build?  And then the incantation to perform iterative 
bisections until I run out.
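
The loop described above lines up with the exit codes that "git bisect run" understands: 0 marks a revision good, 125 asks git to skip it, and other codes from 1 to 127 mark it bad. A minimal sketch of that mapping follows; the helper name bisect_step and the stand-in commands are assumptions, not anything from the FreeRADIUS tree.

```shell
#!/bin/sh
# bisect_step BUILD_CMD TEST_CMD
# Translate one build-and-test cycle into the exit codes that
# `git bisect run` understands: 0 = good, 125 = skip, 1 = bad.
bisect_step() {
    build_cmd=$1
    test_cmd=$2
    if ! sh -c "$build_cmd" >/dev/null 2>&1; then
        return 125          # build failed: this revision cannot be judged
    fi
    if sh -c "$test_cmd" >/dev/null 2>&1; then
        return 0            # test passed: good
    fi
    return 1                # test failed: bad
}

# Stand-in commands. A real run might pass something like
#   "make clean && CFLAGS='-O0 -g' ./configure && make"
# as the build command, and a hypothetical script that exits non-zero
# when the unexpected exit is seen as the test command.
good_rc=0; bisect_step true true  || good_rc=$?
skip_rc=0; bisect_step false true || skip_rc=$?
bad_rc=0;  bisect_step true false || bad_rc=$?
echo "good=$good_rc skip=$skip_rc bad=$bad_rc"
```

git bisect checks out each candidate revision itself after every good, bad or skip verdict, so a "git pull" inside the loop should not be needed.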


I truly hope I haven't provided misleading info.

Thanks,
-craig

-
List info/subscribe/unsubscribe? See http://www.freeradius.org/list/users.html


Re: Unexpected "Exiting normally" 2.1.8?

2009-11-23 Thread Craig Campbell

Hmm...  it seems the error remains...  (See below)

I will try another 'fresh build' tomorrow just in case I did something 
wrong.


Thanks,
-craig

Detaching after fork from child process 659.
Detaching after fork from child process 689.

Program received signal SIGTERM, Terminated.
0x003acf4306a7 in kill () from /lib64/libc.so.6
(gdb)
(gdb)
(gdb) bt full
#0  0x003acf4306a7 in kill () from /lib64/libc.so.6
No symbol table info available.
#1  0x00424172 in main (argc=2, argv=0x7fff6246da68) at 
radiusd.c:419

   rcode = 0
   argval = -1
   spawn_flag = 1
   dont_fork = 1
   flag = 0
   act = {__sigaction_handler = {sa_handler = 0x424349 , 
sa_sigaction = 0x424349 }, sa_mask = {

   __val = {0 }}, sa_flags = 0, sa_restorer = 0}
(gdb) where
#0  0x003acf4306a7 in kill () from /lib64/libc.so.6
#1  0x00424172 in main (argc=2, argv=0x7fff6246da68) at 
radiusd.c:419

(gdb)




-
List info/subscribe/unsubscribe? See http://www.freeradius.org/list/users.html


Re: Unexpected "Exiting normally" 2.1.8?

2009-11-23 Thread Alexander Clouter
Hi,

Craig Campbell  wrote:
>
>I re-acquired the source, but there seems to be a (minor I think) error.
> 
>$git clone git://git.freeradius.org/freeradius-server.git
>$cd freeradius-server
>$git fetch origin stable:stable
>$git pull   <- should be 'git checkout stable'
>$make clean
>$CFLAGS='-O0 -g' ./configure 
>$make
>
Otherwise if I am reading that right you are trying to compile off the 
unstable branch.

Cheers

-- 
Alexander Clouter
.sigmonster says: BOFH excuse #169:
  broadcast packets on wrong frequency

-
List info/subscribe/unsubscribe? See http://www.freeradius.org/list/users.html


Re: Unexpected "Exiting normally" 2.1.8?

2009-11-23 Thread Alan DeKok
Craig Campbell wrote:
> Thanks Alan,
> I re-acquired the source, but there seems to be a (minor I think)
> error.
>  
> 
>   $git clone git://git.freeradius.org/freeradius-server.git
>   $cd freeradius-server
>   $git fetch origin stable:stable
>   $git pull

  No.  See http://git.freeradius.org for the exact commands.

$ git checkout stable
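
To see why the checkout matters, here is a small sketch against a throwaway repository (all paths and file names are made up): "git fetch origin stable:stable" creates a local branch called stable, but the working tree stays on whatever branch the clone started with until "git checkout stable" is run.

```shell
#!/bin/sh
# Toy "upstream" with a default branch and a stable branch, then a clone.
# All paths and file names here are invented for the demonstration.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

tmp=$(mktemp -d)
git init -q "$tmp/upstream"
echo dev > "$tmp/upstream/file.txt"
git -C "$tmp/upstream" add -A
git -C "$tmp/upstream" commit -q -m "development work"
default_branch=$(git -C "$tmp/upstream" symbolic-ref --short HEAD)
git -C "$tmp/upstream" checkout -q -b stable
echo stable > "$tmp/upstream/file.txt"
git -C "$tmp/upstream" commit -q -a -m "stable fix"
git -C "$tmp/upstream" checkout -q "$default_branch"

git clone -q "$tmp/upstream" "$tmp/work"
# Creates a local branch named stable, but the working tree does not move.
git -C "$tmp/work" fetch -q origin stable:stable
before=$(git -C "$tmp/work" symbolic-ref --short HEAD)
# Only the checkout switches the working tree to the stable branch.
git -C "$tmp/work" checkout -q stable
after=$(git -C "$tmp/work" symbolic-ref --short HEAD)
echo "before checkout: $before, after checkout: $after"
```

Building right after the fetch, without the checkout, would therefore compile the development branch rather than stable.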

  Alan DeKok.
-
List info/subscribe/unsubscribe? See http://www.freeradius.org/list/users.html


Re: Unexpected "Exiting normally" 2.1.8?

2009-11-23 Thread Craig Campbell
Thanks Alan,
I re-acquired the source, but there seems to be a (minor I think) error.

$git clone git://git.freeradius.org/freeradius-server.git
$cd freeradius-server
$git fetch origin stable:stable
$git pull
$make clean
$CFLAGS='-O0 -g' ./configure 
$make

  Making all in frs_acct...
  gmake[6]: Entering directory 
`/home/craig/src/freeradius/freeradius-server/src/modules/frs_acct'
  /bin/sh /home/craig/src/freeradius/freeradius-server/libtool --mode=compile 
gcc  -O0 -g -D_REENTRANT -D_POSIX_PTHREAD_SEMANTICS -Wall -D_GNU_SOURCE -g 
-Wshadow -Wpointer-arith -Wcast-qual -Wcast-align -Wwrite-strings 
-Wstrict-prototypes -Wmissing-prototypes -Wmissing-declarations 
-Wnested-externs -W -Wredundant-decls -Wundef 
-I/home/craig/src/freeradius/freeradius-server/src 
-I/home/craig/src/freeradius/freeradius-server/libltdl  -c frs_acct.c
  libtool: compile:  gcc -O0 -g -D_REENTRANT -D_POSIX_PTHREAD_SEMANTICS -Wall 
-D_GNU_SOURCE -g -Wshadow -Wpointer-arith -Wcast-qual -Wcast-align 
-Wwrite-strings -Wstrict-prototypes -Wmissing-prototypes -Wmissing-declarations 
-Wnested-externs -W -Wredundant-decls -Wundef 
-I/home/craig/src/freeradius/freeradius-server/src 
-I/home/craig/src/freeradius/freeradius-server/libltdl -c frs_acct.c  -fPIC 
-DPIC -o .libs/frs_acct.o
  In file included from 
/home/craig/src/freeradius/freeradius-server/src/freeradius-devel/radiusd.h:107,
   from frs_acct.c:29:
  
/home/craig/src/freeradius/freeradius-server/src/freeradius-devel/smodule.h:144:
 error: expected specifier-qualifier-list before 'RADCLIENT'
  gmake[6]: *** [frs_acct.lo] Error 1
  gmake[6]: Leaving directory 
`/home/craig/src/freeradius/freeradius-server/src/modules/frs_acct'
  gmake[5]: *** [common] Error 2

As soon as I can build a version, I'll test again to ensure we got the bug we 
were seeking.

Thanks,
-craig


-
List info/subscribe/unsubscribe? See http://www.freeradius.org/list/users.html

Re: Unexpected "Exiting normally" 2.1.8?

2009-11-22 Thread Alan DeKok
Craig Campbell wrote:
> Once you have another version (reverted), I can test again...
> 
> I am really unfamiliar with git, so I may need a hint as to getting  the
> correct version for testing.

  I've reverted the problem commit.  It doesn't fix the PostgreSQL
issue, and it causes other problems.

  The fix is now in the "stable" branch.

  Alan DeKok.
-
List info/subscribe/unsubscribe? See http://www.freeradius.org/list/users.html


Re: Unexpected "Exiting normally" 2.1.8?

2009-11-18 Thread Craig Campbell

Once you have another version (reverted), I can test again...

I am really unfamiliar with git, so I may need a hint as to getting  the 
correct version for testing.


Thanks,
-craig

-
List info/subscribe/unsubscribe? See http://www.freeradius.org/list/users.html


Re: Unexpected "Exiting normally" 2.1.8?

2009-11-18 Thread Alan DeKok
Craig Campbell wrote:
> Ok,
>I hope this is helpful.  Below please find the git bisect log.
> There were a number of iterations with make errors which I then
> skipped.  I suspect the errors were OS specific and were clearly fixed
> in later iterations.
> 
> -bash-3.2$ git bisect log
> git bisect start
> # bad: [9dbc8974fdd2300a70293eda9c62bce20a3c9165] errormsg may be NULL

  Huh...  Since that commit doesn't help the reported bug, it's likely
best to just revert it.  Oh well.

  Alan DeKok.
-
List info/subscribe/unsubscribe? See http://www.freeradius.org/list/users.html


Re: Unexpected "Exiting normally" 2.1.8?

2009-11-18 Thread Craig Campbell
d] Check for undefined 
types, too

git bisect skip 64700e41098a874581d683c8606c94f9ad23079d
# skip: [f4dd3a6e803219b61f3ec1d1b7f3767ee54e8eec] Free tcp structure, too
git bisect skip f4dd3a6e803219b61f3ec1d1b7f3767ee54e8eec
# skip: [382b6c2223ba1a233ca9f4d248beb888a0123f3e] Print more descriptive 
error message for too many EAP sessions

git bisect skip 382b6c2223ba1a233ca9f4d248beb888a0123f3e
# skip: [5aa01c58d91063b5bbbf5aef941137d7cf638bbe] Changed stop packet msg 
to debug rather than error

git bisect skip 5aa01c58d91063b5bbbf5aef941137d7cf638bbe
# skip: [e69be18535bd8b9a2cfb50a9df7cb44e3129ab4c] Added more debugging 
messages

git bisect skip e69be18535bd8b9a2cfb50a9df7cb44e3129ab4c
# skip: [817e64f14df0e5816d87784f995e8fc4a240e048] Initialize proto for 
old-style realms

git bisect skip 817e64f14df0e5816d87784f995e8fc4a240e048
# skip: [d711a368ebf0e057e54596d22584ca2ce37e209c] Make 
client/port/key-balance more like fail-over

git bisect skip d711a368ebf0e057e54596d22584ca2ce37e209c
# skip: [ff89e4cac7f2a9256c7d360b1d53a1eb69a28f40] More plumbing to get to 
home servers via TCP

git bisect skip ff89e4cac7f2a9256c7d360b1d53a1eb69a28f40
# skip: [fe4bf0a8d6d7e168e0c6729115df1315abbe5e20] Fix typo
git bisect skip fe4bf0a8d6d7e168e0c6729115df1315abbe5e20
# skip: [732917380982c0aa5ff862ffa2d901fbe52dac36] Allow radclient to 
send/receive RADIUS over TCP

git bisect skip 732917380982c0aa5ff862ffa2d901fbe52dac36
# skip: [a4202aeb848174ed430fd29573e3dd2db78ae2a1] fix debian/rules to 
honour CFLAGS

git bisect skip a4202aeb848174ed430fd29573e3dd2db78ae2a1
# skip: [6a6d2b450fd7ddff65e9f73bbe96ba3f5f914f08] Check src_port, not 
dst_port

git bisect skip 6a6d2b450fd7ddff65e9f73bbe96ba3f5f914f08
# skip: [30adbf8230730a7503f5e3654df90c5c2a38a8ed] Call detach only if 
function exists

git bisect skip 30adbf8230730a7503f5e3654df90c5c2a38a8ed
# skip: [8fa1a08726aad4f379c7bcc6df608f8d79594a34] Removed recursive 
mutexes.

git bisect skip 8fa1a08726aad4f379c7bcc6df608f8d79594a34
# skip: [ce2a48e678fd80199b886aeda654ed2f94340c19] Allow clients to use TCP
git bisect skip ce2a48e678fd80199b886aeda654ed2f94340c19
-bash-3.2$

-
List info/subscribe/unsubscribe? See http://www.freeradius.org/list/users.html


Re: Unexpected "Exiting normally" 2.1.8?

2009-11-16 Thread Alan DeKok
Craig Campbell wrote:
> Still running tests with bisect.
> 
> successful runs take some time to identify (a day).
> 
> Please let me know if the bug is identified, otherwise I'll keep
> plugging away.

  Thanks.  Once we know the commit, the fix should hopefully be easy.

  Alan DeKok.
-
List info/subscribe/unsubscribe? See http://www.freeradius.org/list/users.html


Re: Unexpected "Exiting normally" 2.1.8?

2009-11-16 Thread Craig Campbell

Still running tests with bisect.

successful runs take some time to identify (a day).

Please let me know if the bug is identified, otherwise I'll keep plugging 
away.


Thanks,
-craig

- Original Message - 
From: "Alan DeKok" 

To: "FreeRadius users mailing list" 
Sent: Friday, November 06, 2009 5:04 PM
Subject: Re: Unexpected "Exiting normally" 2.1.8?



Craig Campbell wrote:
> I was able to get some bisect runs (I think).  However, I am
> encountering a different error in these.
>
> If radiusd is run in multithreaded mode, it hangs shortly after
> beginning. This particular error has already been fixed (later).

 Use a system that supports recursive mutexes.

> Do you know if the Signal/Exit error depends upon multi threading?  i.e.
> will it happen if run with the -s option?

 It depends on multithreading.

 Alan DeKok.

-
List info/subscribe/unsubscribe? See http://www.freeradius.org/list/users.html


Re: Unexpected "Exiting normally" 2.1.8?

2009-11-04 Thread Alan DeKok
Alexander Clouter wrote:
>  The problem is you *have* to 
> apply my listed cherry-picks, as if you add *any* of the TCP related 
> code Alan has been working on, it all stops compiling[1]

  *Please* use the git "stable" branch.  The "master" branch has a whole
whack of other changes in it which may or may not get into a stable release.

  Much of the work in "stable" has been merged into "master".  But...
the TCP work hasn't.  This is because the re-work in "master" that moves
sockets into loadable modules conflicts with the TCP changes.

  I haven't had the time to go integrate the changes.  And since the
"stable" branch works, it's a low priority.

  Alan DeKok.
-
List info/subscribe/unsubscribe? See http://www.freeradius.org/list/users.html


Re: Unexpected "Exiting normally" 2.1.8?

2009-11-04 Thread Alexander Clouter
Alan DeKok  wrote:
>
> Alexander Clouter wrote:
>> It's when I add (I am pretty sure it's in the first 8 or so 
>> patches) the following I get the same problem with FreeRADIUS:
> ...
>> I guess at this point I am going to be told to be a good boy and run off 
>> and use git bisect? :)
> 
> Pretty much, sorry.
>
It really is bug week for me.  Cisco (x4), FreeRADIUS (x1), Linux (x2), 
etc etc.

Say, I do the git bisect, you will let my ldap xlat dn patch[1] go in, 
I have been patient and waited two years? :)
 
>> Looking through the patches normally I cannot see what could have caused 
>> the graceful exit...which is exactly what I am getting:
> ...
>> #1  0x004228d9 in main (argc=2, argv=0x7fffaaef61c8) at radiusd.c:419
> 
>  That just means that the main event loop exited, and the server is
> telling all child threads to stop.
> 
>  It looks like the server received a TERM, QUIT, or INT signal.  Why, I
> don't know.
>
Yep, that was my take too.  As far as I can tell it just decided to 
gracefully close down which is why when I nosey through the applied 
patches I was hunting for a change in logic flow or something.

>  But yes, "git bisect" would be tremendously useful.  I'm traveling for
> the next week, so I'll have limited time to look at it myself.
> 
Sure thing.  I'll try to find the time tomorrow, however it could take 
a week or so to pin down as I'll need to run for two days to be sure it 
is 'okay'.

Cheers

[1] http://stuff.digriz.org.uk/0001-support-to-get-DN-in-ldap_xlat.patch

-- 
Alexander Clouter
.sigmonster says: Be careful!  Is it classified?

-
List info/subscribe/unsubscribe? See http://www.freeradius.org/list/users.html


Re: Unexpected "Exiting normally" 2.1.8?

2009-11-04 Thread Alexander Clouter
Craig Campbell  wrote:
> 
>  Thanks for the update - I was concluding I'd have to wait for the release 
> of 2.1.8 to pursue this.  I am currently in a situation where I can help 
> debug 2.1.8, since the 'new' systems aren't yet in production.
> 
Well I can see no reason not to run FreeRADIUS in a debugger all the 
time, even when in production.  However my nickname is "Rambo Clouter" 
so maybe you do not want to follow my advice. :)

When you compile FreeRADIUS you simply make sure you leave 
debugging symbols in and turn off compiler optimisations (so your CFLAGS 
should be '-O0 -g').  You probably can do this by running configure as 
follows:

CFLAGS='-O0 -g' ./configure --all-your-usual-options-that-you-want


> Looking at your debug output (and I am in no way an expert at that) it seems 
> as though the process received a signal?
>
Well FreeRADIUS is sending it to herself according to gdb:
 src/main/radiusd.c line 419 
/*
 *  Send a TERM signal to all
 *  associated processes
 *  (including us, which gets
 *  ignored.)
 */
#ifndef __MINGW32__
if (spawn_flag) kill(-radius_pid, SIGTERM);
#endif  


For whatever reason, it is not getting ignored.  At first I thought it 
was because I run my FreeRADIUS (even in production) in gdb, but as you 
do not I am wondering what is actually going on.

To run it in the debugger just run 'gdb freeradius' and you will get the 
gdb prompt.  There you want to type 'run -f' and wait for it to puke.  
When it does you could type 'where' for it to tell you what happened, 
but we know what is happening, we want to find which patch is doing it 
:)  Oh, familiarise yourself with screen[2] if you do not know it already, 
you should run the debugger in a screen'd session so you can return to 
it later without having to remain logged in.

> I am running a 'custom' module (event.c as I recall) from Alan that resolves 
> an issue with hung children (very exciting!), and I followed Alan's 
> instructions to get to this point.  I would really like to try to 'give 
> back' if I can and assist in identifying the cause of the program exiting 
> (assuming it is a new and as of yet unidentified bug).
> 
> Would copying the steps you have below on my two redhat systems be a good 
> way to proceed?
> 
Pretty much follow:

http://www.reactivated.net/weblog/archives/2006/01/using-git-bisect-to-find-buggy-kernel-patches/

I had been running with the cherry-pick'ed patches for weeks and had no 
problems up to 9261f3e0026323b2c397af13d02fbc5780908143, so I am certain 
that the issue is the result of the patches between 
12ead56dffca9b3ecddc8a7860a1ef5b5361b374 and 
9dbc8974fdd2300a70293eda9c62bce20a3c9165.  The problem is you *have* to 
apply my listed cherry-picks, as if you add *any* of the TCP related 
code Alan has been working on, it all stops compiling[1]

Cheers

[1] I am pretty sure Alan has stashed a number of patches that he has 
not put into the publicly available GIT trees as things like 
the jumbo socket clean up patch 
(e04b62f1bd257489bd92ccc584b0886c7e2011e8) refer to 
my_ipaddr/my_port which is not in any header files I have or 
found in 'master'
[2] http://blogamundo.net/code/screen/
-- 
Alexander Clouter
.sigmonster says: Simplicity does not precede complexity, but follows it.

-
List info/subscribe/unsubscribe? See http://www.freeradius.org/list/users.html


Re: Unexpected "Exiting normally" 2.1.8?

2009-11-04 Thread Alan DeKok
Alexander Clouter wrote:
> It's when I add (I am pretty sure it's in the first 8 or so 
> patches) the following I get the same problem with FreeRADIUS:
...
> I guess at this point I am going to be told to be a good boy and run off 
> and use git bisect? :)

 Pretty much, sorry.

> Looking through the patches normally I cannot see what could have caused 
> the graceful exit...which is exactly what I am getting:
...
> #1  0x004228d9 in main (argc=2, argv=0x7fffaaef61c8) at radiusd.c:419

  That just means that the main event loop exited, and the server is
telling all child threads to stop.

  It looks like the server received a TERM, QUIT, or INT signal.  Why, I
don't know.

  But yes, "git bisect" would be tremendously useful.  I'm traveling for
the next week, so I'll have limited time to look at it myself.

  Alan DeKok.
-
List info/subscribe/unsubscribe? See http://www.freeradius.org/list/users.html


Re: Unexpected "Exiting normally" 2.1.8?

2009-11-04 Thread Craig Campbell

Hi Alexander.

 Thanks for the update - I was concluding I'd have to wait for the release 
of 2.1.8 to pursue this.  I am currently in a situation where I can help 
debug 2.1.8, since the 'new' systems aren't yet in production.


Looking at your debug output (and I am in no way an expert at that) it seems 
as though the process received a signal?
I am running a 'custom' module (event.c as I recall) from Alan that resolves 
an issue with hung children (very exciting!), and I followed Alan's 
instructions to get to this point.  I would really like to try to 'give 
back' if I can and assist in identifying the cause of the program exiting 
(assuming it is a new and as of yet unidentified bug).


Would copying the steps you have below on my two redhat systems be a good 
way to proceed?


Let me know,
-craig
- Original Message - 
From: "Alexander Clouter" 

To: 
Sent: Wednesday, November 04, 2009 11:43 AM
Subject: Re: Unexpected "Exiting normally" 2.1.8?



Craig Campbell  wrote:


I'm running an unreleased '"development?" version of freeradius (2.1.8?).


"me too", I get exactly what you are getting.  If you are always
fiddling with FreeRADIUS I recommend you always run it in gdb as then
you can get things fixed easily.

I usually build FreeRADIUS (under Debian stable) with:

git clone http://git.freeradius.org/freeradius-server.git
cd freeradius-server
git checkout release_2_1_7
git checkout -b soas

git cherry-pick c7a9d2aa1b3fa91591ce95f19aa5ba42c102f4f7
git cherry-pick fbdc02ad699b9bc4d410daaa54f76df7141d2f64
git cherry-pick fa0e98d1056e22fa413078dbd8c3fe0d85532826
git cherry-pick 92ab5fef40320d1dbc3fe59db82cb20a3ec69249
git cherry-pick 4ca219b1f1ab68fc8434072e51a8e4b95cf37c16
git cherry-pick 52880d0020b7b900ae8383b142b08e4e11cde639
git cherry-pick 137e3759b2ffc0c4f99064dadbd7461d3e86ae2a
git cherry-pick 9491d6eb7b963532855ccc8a63a523a2a1e3af2b
git cherry-pick 4baebf8202d7db372a9ad2ce5026ec6c986f0de7
git cherry-pick 382b6c2223ba1a233ca9f4d248beb888a0123f3e
git cherry-pick 751e9a39b2221a2623001a4611021a8e01cf4375
git cherry-pick 1013e94b66064f24170e394e63ba4f093c141d74
git cherry-pick 1628ef2387d9f7a89b3c2ff8945f49777eb135f1
git cherry-pick 83c2cd412b1208e67381372baa73c779cd2848f6
git cherry-pick f6e2dba8a7e4dd31d36d5b8ee434d21600e3f99f
git cherry-pick 64700e41098a874581d683c8606c94f9ad23079d
git cherry-pick e69be18535bd8b9a2cfb50a9df7cb44e3129ab4c
git cherry-pick 9261f3e0026323b2c397af13d02fbc5780908143

DEB_BUILD_OPTIONS='debug nostrip noopt' CFLAGS='-DIE_LIBTOOL_DIE' debuild -us -b




It's when I add (I am pretty sure it's in the first 8 or so
patches) the following I get the same problem with FreeRADIUS:

git cherry-pick 12ead56dffca9b3ecddc8a7860a1ef5b5361b374
git cherry-pick d711a368ebf0e057e54596d22584ca2ce37e209c
git cherry-pick 057c7ac764a4639f715edcbde7dc22491b79be62
git cherry-pick a4202aeb848174ed430fd29573e3dd2db78ae2a1
git cherry-pick a92700b3fb88239ccb0db9f5ece68dd430937df3
git cherry-pick b1e815d0b4bec01f9721d4b92786960b2218f149
git cherry-pick 30adbf8230730a7503f5e3654df90c5c2a38a8ed
git cherry-pick f2d96581f98990d24991c99a681d018a3df85e92
git cherry-pick 5aa01c58d91063b5bbbf5aef941137d7cf638bbe
git cherry-pick 9b70af0c517daad7d374f4cc948488429d3a9cc0
git cherry-pick 98b22609015439b16cc62cf45e4472a14377da2a
git cherry-pick 092f0ea30cdfc2d669afe47061fafb9407269641
git cherry-pick b853a84e6c4ccd5d9e2c4ad9da2c421a234e887f
git cherry-pick d9dd62aae7baa5346f19236cead4414c03546d45
git cherry-pick 1700127c8a7150f57056495a2980fd132dafdb92
git cherry-pick 9dbc8974fdd2300a70293eda9c62bce20a3c9165


I guess at this point I am going to be told to be a good boy and run off
and use git bisect? :)

Looking through the patches normally I cannot see what could have caused
the graceful exit...which is exactly what I am getting:

garibaldi:/usr/src# gdb freeradius
GNU gdb 6.8-debian
Copyright (C) 2008 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later
<http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show
copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu"...
(gdb) run -f
Starting program: /usr/sbin/freeradius -f
[Thread debugging using libthread_db enabled]
[New Thread 0x7f9ba2eeaae0 (LWP 14420)]
[New Thread 0x41313950 (LWP 14423)]
[New Thread 0x4271a950 (LWP 14424)]
[New Thread 0x42f1b950 (LWP 14425)]
[New Thread 0x4371c950 (LWP 14426)]
[New Thread 0x43f1d950 (LWP 14427)]

Program received signal SIGTERM, Terminated.
[Switching to Thread 0x7f9ba2eeaae0 (LWP 14420)]
0x7f9ba171e1c7 in kill () from /lib/libc.so.6
(gdb) bt full
#0  0x7f9ba171e1c7 in kill () from /lib/libc.so.6
No symbol table info available.
#1  0x004228d9 i

Re: Unexpected "Exiting normally" 2.1.8?

2009-11-04 Thread Alexander Clouter
Craig Campbell  wrote:
> 
> I'm running an unreleased '"development?" version of freeradius (2.1.8?).
>
"me too", I get exactly what you are getting.  If you are always 
fiddling with FreeRADIUS I recommend you always run it in gdb as then 
you can get things fixed easily.

I usually build FreeRADIUS (under Debian stable) with:

git clone http://git.freeradius.org/freeradius-server.git
cd freeradius-server
git checkout release_2_1_7
git checkout -b soas

git cherry-pick c7a9d2aa1b3fa91591ce95f19aa5ba42c102f4f7
git cherry-pick fbdc02ad699b9bc4d410daaa54f76df7141d2f64
git cherry-pick fa0e98d1056e22fa413078dbd8c3fe0d85532826
git cherry-pick 92ab5fef40320d1dbc3fe59db82cb20a3ec69249
git cherry-pick 4ca219b1f1ab68fc8434072e51a8e4b95cf37c16
git cherry-pick 52880d0020b7b900ae8383b142b08e4e11cde639
git cherry-pick 137e3759b2ffc0c4f99064dadbd7461d3e86ae2a
git cherry-pick 9491d6eb7b963532855ccc8a63a523a2a1e3af2b
git cherry-pick 4baebf8202d7db372a9ad2ce5026ec6c986f0de7
git cherry-pick 382b6c2223ba1a233ca9f4d248beb888a0123f3e
git cherry-pick 751e9a39b2221a2623001a4611021a8e01cf4375
git cherry-pick 1013e94b66064f24170e394e63ba4f093c141d74
git cherry-pick 1628ef2387d9f7a89b3c2ff8945f49777eb135f1
git cherry-pick 83c2cd412b1208e67381372baa73c779cd2848f6
git cherry-pick f6e2dba8a7e4dd31d36d5b8ee434d21600e3f99f
git cherry-pick 64700e41098a874581d683c8606c94f9ad23079d
git cherry-pick e69be18535bd8b9a2cfb50a9df7cb44e3129ab4c
git cherry-pick 9261f3e0026323b2c397af13d02fbc5780908143

DEB_BUILD_OPTIONS='debug nostrip noopt' CFLAGS='-DIE_LIBTOOL_DIE' debuild -us -b
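
The list of cherry-picks above could also be kept in a plain text file and applied in a loop that stops at the first conflict. A sketch only: the function name and the file name picks.txt are hypothetical, and the file is assumed to hold one commit hash per line.

```shell
#!/bin/sh
# Apply every commit listed in a file, in order.
# Blank lines and lines starting with '#' are skipped.
# Stops and returns non-zero at the first cherry-pick that fails.
# Usage: apply_picks FILE
apply_picks() {
    while read -r sha _; do
        case $sha in
            ''|'#'*) continue ;;
        esac
        # stdin is redirected so git cannot consume the rest of the list
        git cherry-pick "$sha" </dev/null || {
            echo "stopped at $sha" >&2
            return 1
        }
    done < "$1"
}

# Example (not executed here), on the branch created above:
#   apply_picks picks.txt
```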



It's when I add the following (I am pretty sure the culprit is in the
first 8 or so patches) that I get the same problem with FreeRADIUS:

git cherry-pick 12ead56dffca9b3ecddc8a7860a1ef5b5361b374
git cherry-pick d711a368ebf0e057e54596d22584ca2ce37e209c
git cherry-pick 057c7ac764a4639f715edcbde7dc22491b79be62
git cherry-pick a4202aeb848174ed430fd29573e3dd2db78ae2a1
git cherry-pick a92700b3fb88239ccb0db9f5ece68dd430937df3
git cherry-pick b1e815d0b4bec01f9721d4b92786960b2218f149
git cherry-pick 30adbf8230730a7503f5e3654df90c5c2a38a8ed
git cherry-pick f2d96581f98990d24991c99a681d018a3df85e92
git cherry-pick 5aa01c58d91063b5bbbf5aef941137d7cf638bbe
git cherry-pick 9b70af0c517daad7d374f4cc948488429d3a9cc0
git cherry-pick 98b22609015439b16cc62cf45e4472a14377da2a
git cherry-pick 092f0ea30cdfc2d669afe47061fafb9407269641
git cherry-pick b853a84e6c4ccd5d9e2c4ad9da2c421a234e887f
git cherry-pick d9dd62aae7baa5346f19236cead4414c03546d45
git cherry-pick 1700127c8a7150f57056495a2980fd132dafdb92
git cherry-pick 9dbc8974fdd2300a70293eda9c62bce20a3c9165


I guess at this point I am going to be told to be a good boy and run off 
and use git bisect? :)
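
For reference, a bisect over a linear branch like this one could look roughly as follows. It is a sketch under assumptions: the function name, the tag name known-good and the script check-exit.sh are hypothetical. The check command has to exit 0 when the build behaves and with a code between 1 and 127 (other than 125, which means "skip this commit") when it shows the early exit.

```shell
#!/bin/sh
# Bisect between a known-good and a known-bad ref, running a check
# command at every step, then print the bisect log and restore HEAD.
# Usage: bisect_range GOOD BAD CHECK_CMD [ARGS...]
bisect_range() {
    good=$1
    bad=$2
    shift 2
    git bisect start "$bad" "$good" &&
        git bisect run "$@"
    status=$?
    git bisect log
    git bisect reset
    return $status
}

# Example (not executed here): tag the branch after the first batch of
# cherry-picks, apply the second batch, then:
#   bisect_range known-good HEAD ./check-exit.sh
```

"git bisect run" reports the first bad commit on its own, so the only part that needs writing by hand is the check script that builds the server and decides whether it exited early.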

Reading through the patches by eye, I cannot see what could have caused
the graceful exit, which is exactly what I am getting:

garibaldi:/usr/src# gdb freeradius
GNU gdb 6.8-debian
Copyright (C) 2008 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu"...
(gdb) run -f
Starting program: /usr/sbin/freeradius -f
[Thread debugging using libthread_db enabled]
[New Thread 0x7f9ba2eeaae0 (LWP 14420)]
[New Thread 0x41313950 (LWP 14423)]
[New Thread 0x4271a950 (LWP 14424)]
[New Thread 0x42f1b950 (LWP 14425)]
[New Thread 0x4371c950 (LWP 14426)]
[New Thread 0x43f1d950 (LWP 14427)]

Program received signal SIGTERM, Terminated.
[Switching to Thread 0x7f9ba2eeaae0 (LWP 14420)]
0x7f9ba171e1c7 in kill () from /lib/libc.so.6
(gdb) bt full
#0  0x7f9ba171e1c7 in kill () from /lib/libc.so.6
No symbol table info available.
#1  0x004228d9 in main (argc=2, argv=0x7fffaaef61c8) at radiusd.c:419
rcode = 0
argval = -1
spawn_flag = 1
dont_fork = 1
flag = 0
act = {__sigaction_handler = {sa_handler = 0x422ab0 , 
sa_sigaction = 0x422ab0 }, sa_mask = {__val = {0 }}, sa_flags = 0,   sa_restorer = 0}
(gdb) where
#0  0x7f9ba171e1c7 in kill () from /lib/libc.so.6
#1  0x004228d9 in main (argc=2, argv=0x7fffaaef61c8) at radiusd.c:419
(gdb) 

(gdb) run -f
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /usr/sbin/freeradius -f
[Thread debugging using libthread_db enabled]
[New Thread 0x7f0874b2bae0 (LWP 14731)]
[New Thread 0x40d60950 (LWP 14732)]
[New Thread 0x41561950 (LWP 14733)]
[New Thread 0x41d62950 (LWP 14734)]
[New Thread 0x42563950 (LWP 14735)]
[New Thread 0x42d64950 (LWP 14736)]

Program received signal SIGTERM, Terminated.
[Switching to Thread 0x7f0874b2bae0 (LWP 14731)]
0x7f087335f1c7 in kill () from /lib/libc.so.6
(gdb) bt full
#0  0x7f087335f1c7 in kill () from /lib/libc.so.6
No symbol table info available.
#1  0x004228d9 in

Unexpected "Exiting normally" 2.1.8?

2009-10-27 Thread Craig Campbell
I'm running an unreleased "development?" version of freeradius (2.1.8?).

So far it is working well, but it is terminating for reasons I cannot determine.

The log contains the following,

Mon Oct 26 15:48:57 2009 : Info: rlm_sql (sql): Driver rlm_sql_mysql (module rlm_sql_mysql) loaded and linked
Mon Oct 26 15:48:57 2009 : Info: rlm_sql (sql): Attempting to connect to radi...@localhost:/radius
Mon Oct 26 15:48:57 2009 : Info: rlm_sql_mysql: Starting connect to MySQL server for #0
Mon Oct 26 15:48:57 2009 : Info: rlm_sql_mysql: Starting connect to MySQL server for #1
Mon Oct 26 15:48:57 2009 : Info: rlm_sql_mysql: Starting connect to MySQL server for #2
Mon Oct 26 15:48:57 2009 : Info: rlm_sql_mysql: Starting connect to MySQL server for #3
Mon Oct 26 15:48:57 2009 : Info: rlm_sql_mysql: Starting connect to MySQL server for #4
Mon Oct 26 15:48:57 2009 : Info: Loaded virtual server inner-tunnel
Mon Oct 26 15:48:57 2009 : Info: Loaded virtual server copy-acct-to-home-server
Mon Oct 26 15:48:57 2009 : Info: Loaded virtual server copy-acct-to-radius-c
Mon Oct 26 15:48:57 2009 : Info: Loaded virtual server 
Mon Oct 26 15:48:57 2009 : Info: Ready to process requests.
Mon Oct 26 17:57:33 2009 : Error: PROXY: Marking home server 192.168.1.226 port 1813 as zombie (it looks like it is dead).
Mon Oct 26 17:58:13 2009 : Info: PROXY: Marking home server 192.168.1.226 port 1813 as dead.
Mon Oct 26 20:05:36 2009 : Info: Exiting normally.

The zombie messages are suspicious, since neither host is under any
significant load. (The server being marked as zombie is also running 2.1.8.
There is a 2.1.7 server as well, which is NOT being marked as zombie.)
The exit message comes much later, with no hint as to WHY the server is
exiting normally.
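
To see how the proxy state changes and the exit line sit relative to each other, the log can be reduced to just those events with the gap in seconds between them. A rough sketch: the function name is made up, it assumes the timestamp layout shown in the log above (time of day in the fourth field), and it assumes all events fall on the same day.

```shell
#!/bin/sh
# Print startup, proxy state changes and shutdown lines from a
# radius.log-style file, each prefixed with the number of seconds
# since the previous matching line.
# Usage: log_events FILE
log_events() {
    awk '
        /Ready to process requests|Marking home server|Exiting normally/ {
            split($4, t, ":")                      # HH:MM:SS
            now = t[1] * 3600 + t[2] * 60 + t[3]   # seconds since midnight
            if (seen) printf "+%ds  ", now - prev
            else      printf "start  "
            print
            prev = now
            seen = 1
        }
    ' "$1"
}

# Example (not executed here; the path is an assumption, adjust it to
# wherever the server writes its log):
#   log_events /var/log/freeradius/radius.log
```

This only lines up the timing of the events; it says nothing about what caused the exit.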

Any hints would be greatly appreciated.

Thanks,
-craig



Craig Campbell 
craig.campb...@ccraft.ca 
CampbellCraft Consulting Inc
2 Kenny Court 
Whitby, Ontario 
Canada 
L1R 2L8 
905 922-2789 

 



-
List info/subscribe/unsubscribe? See http://www.freeradius.org/list/users.html