Re: FR-1.1.3 on solaris10 strange things
Alexander Serkin <[EMAIL PROTECTED]> wrote:
> "Proxy server" instead of "proxy server" in proxy.conf.
> So it did not retry, set retry_delay to 0, and so on...

  Still, values of zero are bad.

  Alan DeKok.
--
  http://deployingradius.com - The web site of the book
  http://deployingradius.com/blog/ - The blog
-
List info/subscribe/unsubscribe? See http://www.freeradius.org/list/users.html
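To illustrate why a zero delay is costly, here is a toy model of a polling loop. It is only a sketch and not FreeRADIUS's actual request-list code: the function name, the 1 ms cost per pass, and the use of a 60 s window (the max_request_time mentioned in this thread) are all assumptions chosen for illustration.

```python
def count_wakeups(retry_delay_ms: int, window_ms: int, pass_cost_ms: int = 1) -> int:
    """Count how often a toy polling loop wakes up within window_ms.

    pass_cost_ms is an assumed cost of one pass over the request list;
    it is a made-up figure, not a measurement.
    """
    now_ms = 0
    wakeups = 0
    while now_ms < window_ms:
        wakeups += 1
        # One pass costs some CPU time, then the loop sleeps for the delay.
        now_ms += pass_cost_ms + retry_delay_ms
    return wakeups


# With a 5 s delay the loop wakes a handful of times per minute;
# with a delay of zero it never sleeps and spins for the whole window.
print(count_wakeups(5000, 60000))
print(count_wakeups(0, 60000))
```

In this model the loop with a zero delay spends all of its time walking the list, which matches the rapid "Waking up in 0 seconds..." output reported elsewhere in the thread.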
Re: FR-1.1.3 on solaris10 strange things
Sorry, sorry, sorry. It's all my fault.
"Proxy server" instead of "proxy server" in proxy.conf.
So it did not retry, set retry_delay to 0, and so on...

--
Sincerely Yours,
Alexander
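A mistake like this is easy to miss by eye. The following is a minimal sketch of a check for it, assuming that section names are matched case-sensitively (as the behaviour described above suggests). The helper name and the list of known sections are hypothetical; the list contains only the two section names that appear in the proxy.conf quoted in this thread.

```python
# Section names taken from the proxy.conf in this thread; this is an
# illustrative list, not a complete list of FreeRADIUS sections.
KNOWN_SECTIONS = ("proxy server", "realm")


def find_miscased_sections(config_text):
    """Return (line number, header as written, expected name) for each
    section header that matches a known name only when case is ignored."""
    problems = []
    for lineno, line in enumerate(config_text.splitlines(), start=1):
        head, brace, _ = line.partition("{")
        if not brace:
            continue  # not a section header
        head = head.strip()
        for name in KNOWN_SECTIONS:
            lowered = head.lower()
            loose = lowered == name or lowered.startswith(name + " ")
            exact = head == name or head.startswith(name + " ")
            if loose and not exact:
                problems.append((lineno, head, name))
    return problems


SAMPLE = """Proxy server {
    retry_delay = 5
    retry_count = 3
}
realm DUMMY {
    type = radius
}
"""

for lineno, found, expected in find_miscased_sections(SAMPLE):
    print(f"line {lineno}: found '{found}', expected '{expected}'")
```

Run against the sample, the check flags only the capitalised "Proxy server" header and leaves "realm DUMMY" alone, since the realm name itself is free-form.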
Re: FR-1.1.3 on solaris10 strange things
Alexander Serkin wrote:
> Alexander Serkin wrote:
>> ...
>> After that the strings Walking/Waking rapidly appear during dead_time
>> configured in proxy.conf and at the same time the process takes about
>> 50% of CPU on slow netra 1120 (2x440MHz) and up to 99% on Netra-240
>> (1x1GHz). After dead_time we see:
>
> Sorry, not after dead_time. After (retry_delay*retry_count).

Sorry again. After max_request_time (60s).

--
Sincerely Yours,
Alexander
Re: FR-1.1.3 on solaris10 strange things
Alexander Serkin wrote:
> Alan DeKok wrote:
>> Alexander Serkin <[EMAIL PROTECTED]> wrote:
>>> Maybe someone could give advice on how to debug the problem while the
>>> server is not in production?
>>
>> Attach to it with gdb, and see what it's doing.
>
> Got some debugs on this. The problem does not depend on the Solaris
> version - both 9 and 10 show the same effect.
> The effect shows up when the request is proxied to another server and
> that server does not answer:
> ...
> After that the strings Walking/Waking rapidly appear during dead_time
> configured in proxy.conf and at the same time the process takes about
> 50% of CPU on slow netra 1120 (2x440MHz) and up to 99% on Netra-240
> (1x1GHz). After dead_time we see:

Sorry, not after dead_time. After (retry_delay*retry_count).

--
Sincerely Yours,
Alexander
Re: FR-1.1.3 on solaris10 strange things
Alan DeKok wrote:
> Alexander Serkin <[EMAIL PROTECTED]> wrote:
>> Maybe someone could give advice on how to debug the problem while the
>> server is not in production?
>
> Attach to it with gdb, and see what it's doing.

Got some debugs on this. The problem does not depend on the Solaris version - both 9 and 10 show the same effect.
The effect shows up when the request is proxied to another server and that server does not answer:

rad_recv: Access-Request packet from host 127.0.0.1:34653, id=69, length=81
        User-Name = "mobile"
        User-Password = "internet"
        Calling-Station-Id = "999"
        Framed-Protocol = PPP
        Service-Type = Framed-User
        NAS-IP-Address = 212.119.97.85
rad_lowerpair:  User-Name now 'mobile'
  Processing the authorize section of radiusd.conf
modcall: entering group authorize for request 0
  modcall[authorize]: module "preprocess" returns ok for request 0
  modcall[authorize]: module "chap" returns noop for request 0
    rlm_realm: No '@' in User-Name = "mobile", looking up realm NULL
    rlm_realm: Found realm "NULL"
    rlm_realm: Adding Stripped-User-Name = "mobile"
    rlm_realm: Proxying request from user mobile to realm NULL
    rlm_realm: Adding Realm = "NULL"
    rlm_realm: Authentication realm is LOCAL.
  modcall[authorize]: module "suffix" returns noop for request 0
    users: Matched entry DEFAULT at line 156
  modcall[authorize]: module "files" returns ok for request 0
radius_xlat:  'mobile'
rlm_sql (sqlauth): sql_set_user escaped user --> 'mobile'
radius_xlat:  'SELECT id,UserName,Attribute,Value,op FROM radcheck WHERE Username = 'mobile' ORDER BY id'
rlm_sql (sqlauth): Reserving sql socket id: 4
radius_xlat:  'SELECT radgroupcheck.id,radgroupcheck.GroupName,radgroupcheck.Attribute,radgroupcheck.Value,radgroupcheck.op FROM radgroupcheck,usergroup WHERE (usergroup.Username = 'mobile' or usergroup.CLID = '999') AND usergroup.GroupName = radgroupcheck.GroupName ORDER BY usergroup.PRIORITY,radgroupcheck.id'
radius_xlat:  'SELECT id,UserName,Attribute,Value,op FROM radreply WHERE Username = 'mobile' ORDER BY id'
radius_xlat:  'SELECT radgroupreply.id,radgroupreply.GroupName,radgroupreply.Attribute,radgroupreply.Value,radgroupreply.op FROM radgroupreply,usergroup WHERE (usergroup.Username = 'mobile' OR usergroup.CLID = '999') AND usergroup.GroupName = radgroupreply.GroupName ORDER BY radgroupreply.id'
rlm_sql (sqlauth): Released sql socket id: 4
  modcall[authorize]: module "sqlauth" returns ok for request 0
  modcall[authorize]: module "mschap" returns noop for request 0
modcall: leaving group authorize (returns ok) for request 0
Sending Access-Request of id 0 to 212.119.96.99 port 1812
        User-Name = "mobile"
        User-Password = "internet"
        Calling-Station-Id = "999"
        Framed-Protocol = PPP
        Service-Type = Framed-User
        NAS-IP-Address = 212.119.97.85
        Proxy-State = 0x3639
--- Walking the entire request list ---
Waking up in 1 seconds...
--- Walking the entire request list ---
Waking up in 0 seconds...

After that the strings Walking/Waking rapidly appear during dead_time configured in proxy.conf and at the same time the process takes about 50% of CPU on slow netra 1120 (2x440MHz) and up to 99% on Netra-240 (1x1GHz).
After dead_time we see:

Waking up in 0 seconds...
--- Walking the entire request list ---
Rejecting request 0 due to lack of any response from home server localhost:34653
Server rejecting request 0.
Waking up in 0 seconds...
--- Walking the entire request list ---
Sending Access-Reject of id 69 to 127.0.0.1 port 34653
Cleaning up request 0 ID 69 with timestamp 45596c9d
Nothing to do.  Sleeping until we see a request.
--- Walking the entire request list ---
Nothing to do.  Sleeping until we see a request.

I do not understand why it says "home server localhost" when the request was proxied to home server 212.119.96.99.
Maybe I have something misconfigured in proxy.conf?

proxy.conf:

Proxy server {
        synchronous = no
        retry_delay = 5
        retry_count = 3
        dead_time = 15
        default_fallback = no
}

realm DUMMY {
        type        = radius
        authhost    = 212.119.96.99:1812
        accthost    = 212.119.96.99:1813
        secret      = secret
        nostrip
}

--
Sincerely Yours,
Alexander
Re: FR-1.1.3 on solaris10 strange things
On Wed, 2006-11-08 at 14:56 -0500, Alan DeKok wrote:
> Alexander Serkin <[EMAIL PROTECTED]> wrote:
> > Maybe someone could give advice on how to debug the problem while the
> > server is not in production?
>
>   Attach to it with gdb, and see what it's doing.
>
Or use the 'truss' command to see what is going on.

John.

--
---
John Horne, University of Plymouth, UK
Tel: +44 (0)1752 233914
E-mail: [EMAIL PROTECTED]
Fax: +44 (0)1752 233839
Re: FR-1.1.3 on solaris10 strange things
Alexander Serkin <[EMAIL PROTECTED]> wrote:
> Maybe someone could give advice on how to debug the problem while the
> server is not in production?

  Attach to it with gdb, and see what it's doing.

  Alan DeKok.
--
  http://deployingradius.com - The web site of the book
  http://deployingradius.com/blog/ - The blog
Re: FR-1.1.3 on solaris10 strange things
Alan DeKok wrote:
> Alexander Serkin <[EMAIL PROTECTED]> wrote:
>> We have strange behaviour on sparc solaris 10 server with fr-1.1.3
>> installed:
>> without any visible reason the radiusd process goes to almost 100% CPU
>> usage for 3-5 minutes. Then it comes back to normal state again (less
>> than 1% CPU).
>
> Yuck. I don't run Solaris, so I can't comment more than that...
>
> It looks like a busy loop somewhere, probably in the main socket
> handling code.

We'll run a second instance on another Netra soon.
Maybe someone could give advice on how to debug the problem while the server is not in production?

--
Sincerely Yours,
Alexander
Re: FR-1.1.3 on solaris10 strange things
Alexander Serkin <[EMAIL PROTECTED]> wrote:
> We have strange behaviour on sparc solaris 10 server with fr-1.1.3
> installed:
> without any visible reason the radiusd process goes to almost 100% CPU
> usage for 3-5 minutes. Then it comes back to normal state again (less
> than 1% CPU).

  Yuck. I don't run Solaris, so I can't comment more than that...

  It looks like a busy loop somewhere, probably in the main socket
handling code.

  Alan DeKok.
--
  http://deployingradius.com - The web site of the book
  http://deployingradius.com/blog/ - The blog
FR-1.1.3 on solaris10 strange things
Hi.

We have strange behaviour on a SPARC Solaris 10 server with fr-1.1.3 installed:
for no visible reason the radiusd process goes to almost 100% CPU usage for 3-5 minutes. Then it returns to its normal state (less than 1% CPU).
As far as we can see, the 100% CPU load does not affect system functionality - there are no problems with authentication/accounting processing.
The server is not heavily loaded - it handles no more than 2-3 requests per second.

prstat output reports:

PID USERNAME SIZE RSS STATE PRI NICE TIME CPU PROCESS/NLWP
757 radius 93M 10M run 400 0:56:05 99% radiusd/18

and "prstat -vm":

PID USERNAME USR SYS TRP TFL DFL LCK SLP LAT VCX ICX SCL SIG PROCESS/NLWP
757 radius   4.5 1.1 0.0 0.0 0.0  93 0.2 1.6  65 315 .24   0 radiusd/18

Has anybody seen this? What could be the reason?
Previously it ran on a Netra-1120 with Solaris 9; the problem appeared after moving to a Netra-240 with Solaris 10:

5.10 Generic sun4u sparc SUNW,Netra-240

--
Alexander