Re: application hangs in STABLE from time to time
Hello!

On Fri, 1 Dec 2006, Ganbold wrote:

> > > > this. Next time use the -o wchan argument to ps to find out what
> > > > state the process is blocked in.
> > > Ok, Here it is:
> > >
> > >   573  ??  Is     0:00.02 /usr/sbin/inetd -wW -C 60
> > > 78721  ??  I      0:00.01 /usr/local/Radiator-3.15/hooks/PSA
> > >   ^
> > > voiprad# ps -o wchan
> > > WCHAN
> > > ttyin

It's more convenient to use -O here:

[EMAIL PROTECTED] ps axO wchan
  PID WCHAN   TT  STAT      TIME COMMAND
    0 -       ??  WLs    0:00.00 [swapper]
    1 wait    ??  ILs    0:00.01 /sbin/init --
    2 -       ??  DL     0:00.58 [g_event]
    3 -       ??  DL     0:11.72 [g_up]
    4 -       ??  DL     0:18.63 [g_down]
    5 crypto  ??  DL     0:00.00 [crypto]

> > > > kgdb /dev/mem /boot/kernel/kernel.symbols
> I tried with kernel.debug without success:
>
> voiprad# kgdb /dev/mem /usr/obj/usr/src/sys/VOIPRAD/kernel.debug
> kgdb: bad namelist

Reverse the arguments:

[EMAIL PROTECTED] kgdb /boot/kernel.debug/kernel.debug /dev/mem
kgdb: kvm_nlist(_stopped_cpus):
kgdb: kvm_nlist(_stoppcbs):
[GDB will not be able to debug user-mode threads: /usr/lib/libthread_db.so: Undefined symbol "ps_pglobal_lookup"]
GNU gdb 6.1.1 [FreeBSD]
...
(kgdb) info threads
  111 Thread 100127 (PID=3196: kgdb)  0xc04aa6fb in sched_switch (
    td=0xc6b31900, newtd=0xc35b2600, flags=1)
    at /usr/RELENG_6/src/sys/kern/sched_4bsd.c:973
  110 Thread 100100 (PID=2734: more)  0xc04aa6fb in sched_switch (
    td=0xc4dd3d80, newtd=0xc35b2600, flags=1)
    at /usr/RELENG_6/src/sys/kern/sched_4bsd.c:973

> thanks,
> Ganbold

Sincerely, Dmitry
--
Atlantis ISP, System Administrator
e-mail: [EMAIL PROTECTED]
nic-hdl: LYNX-RIPE
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "[EMAIL PROTECTED]"
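The argument order Dmitry shows (debug kernel first, memory device second) can be captured in a small wrapper. This is only a sketch: kgdb_cmdline is a hypothetical helper name, not part of kgdb, and it prints the command line instead of running it. Whether the reversed order also resolves the "bad namelist" error for the VOIPRAD kernel is not confirmed in the thread.

```sh
#!/bin/sh
# kgdb_cmdline is a hypothetical helper, not part of kgdb.
# It prints a kgdb command line with the debug kernel first and the
# memory device (or core file) second, the order that worked above.
kgdb_cmdline() {
    kernel=$1
    core=${2:-/dev/mem}    # default to live kernel memory
    if [ -z "$kernel" ]; then
        echo "usage: kgdb_cmdline kernel [core]" >&2
        return 1
    fi
    printf 'kgdb %s %s\n' "$kernel" "$core"
}

# Path taken from Ganbold's message; it is printed, not executed.
kgdb_cmdline /usr/obj/usr/src/sys/VOIPRAD/kernel.debug
```

Printing the command first makes it easy to check the order before pointing kgdb at a live kernel.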
Re: application hangs in STABLE from time to time
Kris,

Kris Kennaway wrote:
> On Fri, Dec 01, 2006 at 01:02:33PM +0800, Ganbold wrote:
> > Kris Kennaway wrote:
> > > On Fri, Nov 24, 2006 at 10:02:23AM +0800, Ganbold wrote:
> > >
> > > > So do I have interrupt storms here and it is something related to bge?
> > >
> > > No, your interrupts look fine.
> > >
> > > > What else should I check when application hangs again?
> > >
> > > The most important thing to know is what is the application doing when
> > > it hangs. Unfortunately none of the information you provided shows
> > > this. Next time use the -o wchan argument to ps to find out what
> > > state the process is blocked in.
> > Ok, Here it is:
> >
> >   573  ??  Is     0:00.02 /usr/sbin/inetd -wW -C 60
> > 78721  ??  I      0:00.01 /usr/local/Radiator-3.15/hooks/PSA
> >   ^
> > 78744  ??  Is     0:00.05 sshd: tsgan [priv] (sshd)
> > 78747  ??  S      0:00.02 sshd: [EMAIL PROTECTED] (sshd)
> >   591  v0  Is+    0:00.00 /usr/libexec/getty Pc ttyv0
> >   592  v1  Is+    0:00.00 /usr/libexec/getty Pc ttyv1
> >   593  v2  Is+    0:00.00 /usr/libexec/getty Pc ttyv2
> >   594  v3  Is+    0:00.00 /usr/libexec/getty Pc ttyv3
> >   595  v4  Is+    0:00.00 /usr/libexec/getty Pc ttyv4
> >   596  v5  Is+    0:00.00 /usr/libexec/getty Pc ttyv5
> >   597  v6  Is+    0:00.00 /usr/libexec/getty Pc ttyv6
> >   598  v7  Is+    0:00.00 /usr/libexec/getty Pc ttyv7
> > 16099  p0- I     20:29.05 perl /usr/local/Radiator-3.15/radiusd
> > -log_file /var/log/radius/logfile -config_file
> > /usr/local/Radiator-3.15/voip.cfg -pid_file /
> > 78748  p0  Is     0:00.01 -sh (sh)
> > 78750  p0  I      0:00.01 su
> > 78751  p0  S      0:00.04 _su (csh)
> > 78761  p0  R+     0:00.00 ps ax
> >
> > voiprad#ps axHlwww|grep PSA
> >     0 78721 16099   0   4  0  1696  1184 sbwait I     ??    0:00.01
> > /usr/local/Radiator-3.15/hooks/PSA
> > voiprad#
> > voiprad#
> > voiprad# ps -o wchan
> > WCHAN
> > ttyin
> > ttyin
> > ttyin
> > ttyin
> > ttyin
> > ttyin
> > ttyin
> > ttyin
> > piperd
> > wait
> > pause
> > -
>
> Well, I meant a more complete command than that one ;-) Fortunately
> it's also included in your previous output above ("sbwait"). This
> means that the process is waiting for network traffic (usually waiting
> for another local or remote process to send it data). So it's not
> obviously pointing to a problem.
>
> Remind me again how you know this isn't an application bug (sorry,
> I've forgotten context)?

This application connects to a remote mysql-4.0.x server, sends some
queries, does some calculations, and returns. I tried running it from
the console, and it works fine. It is run from the radius server and
works fine serving user access requests, except that it sometimes
hangs. It used to work fine on FreeBSD 5.2-STABLE before upgrading to
RELENG-6. So I guess there is something else.

> > > You can also use kgdb to find out
> > > where it is waiting in the kernel:
> > >
> > > kgdb /dev/mem /boot/kernel/kernel.symbols
> > > info threads
> > > thread
> > > bt
> >
> > Oh, I don't have kernel.symbols file, how to enable it?
>
> It might be in your kernel compilation directory (possibly called
> kernel.debug). Otherwise, you'll have to build a new kernel and
> trigger the problem again.

I tried with kernel.debug without success:

voiprad# kgdb /dev/mem /usr/obj/usr/src/sys/VOIPRAD/kernel.debug
kgdb: bad namelist

thanks,
Ganbold

> kris
Re: application hangs in STABLE from time to time
On Fri, Dec 01, 2006 at 01:02:33PM +0800, Ganbold wrote:
> Kris Kennaway wrote:
> > On Fri, Nov 24, 2006 at 10:02:23AM +0800, Ganbold wrote:
> >
> > > So do I have interrupt storms here and it is something related to bge?
> >
> > No, your interrupts look fine.
> >
> > > What else should I check when application hangs again?
> >
> > The most important thing to know is what is the application doing when
> > it hangs. Unfortunately none of the information you provided shows
> > this. Next time use the -o wchan argument to ps to find out what
> > state the process is blocked in.
> Ok, Here it is:
>
>   573  ??  Is     0:00.02 /usr/sbin/inetd -wW -C 60
> 78721  ??  I      0:00.01 /usr/local/Radiator-3.15/hooks/PSA
>   ^
> 78744  ??  Is     0:00.05 sshd: tsgan [priv] (sshd)
> 78747  ??  S      0:00.02 sshd: [EMAIL PROTECTED] (sshd)
>   591  v0  Is+    0:00.00 /usr/libexec/getty Pc ttyv0
>   592  v1  Is+    0:00.00 /usr/libexec/getty Pc ttyv1
>   593  v2  Is+    0:00.00 /usr/libexec/getty Pc ttyv2
>   594  v3  Is+    0:00.00 /usr/libexec/getty Pc ttyv3
>   595  v4  Is+    0:00.00 /usr/libexec/getty Pc ttyv4
>   596  v5  Is+    0:00.00 /usr/libexec/getty Pc ttyv5
>   597  v6  Is+    0:00.00 /usr/libexec/getty Pc ttyv6
>   598  v7  Is+    0:00.00 /usr/libexec/getty Pc ttyv7
> 16099  p0- I     20:29.05 perl /usr/local/Radiator-3.15/radiusd
> -log_file /var/log/radius/logfile -config_file
> /usr/local/Radiator-3.15/voip.cfg -pid_file /
> 78748  p0  Is     0:00.01 -sh (sh)
> 78750  p0  I      0:00.01 su
> 78751  p0  S      0:00.04 _su (csh)
> 78761  p0  R+     0:00.00 ps ax
>
> voiprad#ps axHlwww|grep PSA
>     0 78721 16099   0   4  0  1696  1184 sbwait I     ??    0:00.01
> /usr/local/Radiator-3.15/hooks/PSA
>
> voiprad#
> voiprad#
> voiprad# ps -o wchan
> WCHAN
> ttyin
> ttyin
> ttyin
> ttyin
> ttyin
> ttyin
> ttyin
> ttyin
> piperd
> wait
> pause
> -

Well, I meant a more complete command than that one ;-) Fortunately
it's also included in your previous output above ("sbwait"). This
means that the process is waiting for network traffic (usually waiting
for another local or remote process to send it data). So it's not
obviously pointing to a problem.

Remind me again how you know this isn't an application bug (sorry,
I've forgotten context)?

> > You can also use kgdb to find out
> > where it is waiting in the kernel:
> >
> > kgdb /dev/mem /boot/kernel/kernel.symbols
> > info threads
> > thread
> > bt
>
> Oh, I don't have kernel.symbols file, how to enable it?

It might be in your kernel compilation directory (possibly called
kernel.debug). Otherwise, you'll have to build a new kernel and
trigger the problem again.

kris
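A minimal sketch of pulling the wait channel out of `ps axl` output like the line quoted above. It assumes the FreeBSD long-format column order (UID PID PPID CPU PRI NI VSZ RSS MWCHAN STAT TT TIME COMMAND), which matches the "sbwait" line in the thread. The function name wchan_for is hypothetical, not part of ps.

```sh
#!/bin/sh
# wchan_for is a hypothetical helper, not part of ps.
# It reads `ps axl`-style text on stdin and prints "PID WCHAN" for every
# line matching the regular expression given as $1.
# Assumed column order (FreeBSD `ps l`):
#   UID PID PPID CPU PRI NI VSZ RSS MWCHAN STAT TT TIME COMMAND
# so the PID is field 2 and the wait channel is field 9.
wchan_for() {
    awk -v pat="$1" '
        NR == 1 && $2 == "PID" { next }   # skip the header line if present
        $0 ~ pat { print $2, $9 }
    '
}

# Typical use on a live system (not run here):
#   ps axlwww | wchan_for "[P]SA"
```

Writing the pattern as `[P]SA` keeps the awk process from matching its own command line, the same trick often used with `ps | grep`. For the line in the thread this prints the PID 78721 next to `sbwait`.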
Re: application hangs in STABLE from time to time
Kris Kennaway wrote:
> On Fri, Nov 24, 2006 at 10:02:23AM +0800, Ganbold wrote:
>
> > So do I have interrupt storms here and it is something related to bge?
>
> No, your interrupts look fine.
>
> > What else should I check when application hangs again?
>
> The most important thing to know is what is the application doing when
> it hangs. Unfortunately none of the information you provided shows
> this. Next time use the -o wchan argument to ps to find out what
> state the process is blocked in.

Ok, Here it is:

  573  ??  Is     0:00.02 /usr/sbin/inetd -wW -C 60
78721  ??  I      0:00.01 /usr/local/Radiator-3.15/hooks/PSA
  ^
78744  ??  Is     0:00.05 sshd: tsgan [priv] (sshd)
78747  ??  S      0:00.02 sshd: [EMAIL PROTECTED] (sshd)
  591  v0  Is+    0:00.00 /usr/libexec/getty Pc ttyv0
  592  v1  Is+    0:00.00 /usr/libexec/getty Pc ttyv1
  593  v2  Is+    0:00.00 /usr/libexec/getty Pc ttyv2
  594  v3  Is+    0:00.00 /usr/libexec/getty Pc ttyv3
  595  v4  Is+    0:00.00 /usr/libexec/getty Pc ttyv4
  596  v5  Is+    0:00.00 /usr/libexec/getty Pc ttyv5
  597  v6  Is+    0:00.00 /usr/libexec/getty Pc ttyv6
  598  v7  Is+    0:00.00 /usr/libexec/getty Pc ttyv7
16099  p0- I     20:29.05 perl /usr/local/Radiator-3.15/radiusd -log_file /var/log/radius/logfile -config_file /usr/local/Radiator-3.15/voip.cfg -pid_file /
78748  p0  Is     0:00.01 -sh (sh)
78750  p0  I      0:00.01 su
78751  p0  S      0:00.04 _su (csh)
78761  p0  R+     0:00.00 ps ax

voiprad#ps axHlwww|grep PSA
    0 78721 16099   0   4  0  1696  1184 sbwait I     ??    0:00.01 /usr/local/Radiator-3.15/hooks/PSA
voiprad#
voiprad#
voiprad# ps -o wchan
WCHAN
ttyin
ttyin
ttyin
ttyin
ttyin
ttyin
ttyin
ttyin
piperd
wait
pause
-

> You can also use kgdb to find out
> where it is waiting in the kernel:
>
> kgdb /dev/mem /boot/kernel/kernel.symbols
> info threads
> thread
> bt

Oh, I don't have kernel.symbols file, how to enable it?

thanks,
Ganbold

> Kris
Re: application hangs in STABLE from time to time
On Fri, Nov 24, 2006 at 10:02:23AM +0800, Ganbold wrote:

> So do I have interrupt storms here and it is something related to bge?

No, your interrupts look fine.

> What else should I check when application hangs again?

The most important thing to know is what is the application doing when
it hangs. Unfortunately none of the information you provided shows
this. Next time use the -o wchan argument to ps to find out what
state the process is blocked in. You can also use kgdb to find out
where it is waiting in the kernel:

kgdb /dev/mem /boot/kernel/kernel.symbols
info threads
thread
bt

Kris