Hi David,

Cool, so my guess was right :) Thanks for the update.

Regards,
Bogdan

David Cunningham wrote:
> Hi Bogdan,
>
> Just to let you know, we traced the problem to the Perl code. Thank
> you for your help!
>
> On Mon, May 3, 2010 at 8:48 AM, Bogdan-Andrei Iancu
> <bog...@voice-system.ro> wrote:
>
>> Hi David,
>>
>> Based on the "ps" output, it seems that the zombie processes were
>> forked by opensips worker processes. That happens only when using
>> the exec module (which you do not have), so the only alternative is
>> that the Perl scripts you are using are doing the fork (maybe some
>> Perl function?) and do not properly terminate the extra procs...
>>
>> Regards,
>> Bogdan
>>
>> David Cunningham wrote:
>>
>>> Hello,
>>>
>>> Certainly, here they are from opensips.cfg and I've included the
>>> modparam in case they help:
>>>
>>> loadmodule "db_mysql.so"
>>> loadmodule "sl.so"
>>> loadmodule "tm.so"
>>> loadmodule "usrloc.so"
>>> loadmodule "auth.so"
>>> loadmodule "auth_db.so"
>>> loadmodule "maxfwd.so"
>>> loadmodule "mi_fifo.so"
>>> loadmodule "nathelper.so"
>>> loadmodule "perl.so"
>>> loadmodule "registrar.so"
>>> loadmodule "rr.so"
>>> loadmodule "textops.so"
>>> loadmodule "uri.so"
>>>
>>> modparam( "auth", "nonce_expire", 30 )
>>> modparam( "auth_db|domain|uri_db|usrloc", "db_url", "mysql://foo" )
>>> modparam( "auth_db", "calculate_ha1", yes )
>>> modparam( "auth_db", "password_column", "secret" )
>>> modparam( "auth_db", "use_domain", 0 )
>>> modparam( "auth_db", "user_column", "name" )
>>> modparam( "mi_fifo", "fifo_name", "/tmp/opensips_fifo" )
>>> modparam( "nathelper", "natping_interval", 240 )
>>> modparam( "nathelper", "ping_nated_only", 1 )
>>> modparam( "nathelper", "sipping_bflag", 1 )
>>> modparam( "nathelper", "sipping_from", "sip:keepal...@foo" )
>>> modparam( "nathelper|registrar", "received_avp", "$avp(i:42)" )
>>> modparam( "perl", "filename", "/path/to/OpenSIPS.pm" )
>>> modparam( "perl", "modpath", "/path/to/perllib" )
>>> modparam( "registrar", "append_branches", 1 )
>>> modparam( "rr", "enable_full_lr", 1 )
>>> modparam( "usrloc", "db_mode", 2 )
>>> modparam( "usrloc", "desc_time_order", 1 )
>>> modparam( "usrloc", "nat_bflag", 1 )
>>> modparam( "usrloc", "timer_interval", 5 )
>>>
>>> Thank you!
>>>
>>> On Wed, Apr 28, 2010 at 4:53 PM, Bogdan-Andrei Iancu
>>> <bog...@voice-system.ro> wrote:
>>>
>>>> Hi David,
>>>>
>>>> By chance, are you using the "exec" module?
>>>>
>>>> Or, can you list the modules you are using?
>>>>
>>>> Regards,
>>>> Bogdan
>>>>
>>>> David Cunningham wrote:
>>>>
>>>>> Hello,
>>>>>
>>>>> Thank you for the reply. I checked the parent of the zombie processes,
>>>>> and they seem to be "SIP receiver" processes as per the following "ps
>>>>> -ef" extract and "opensipsctl fifo ps" information.
>>>>> We're not running the "respawn" patch.
>>>>>
>>>>> Any more advice very welcome, thanks again!
>>>>>
>>>>> user      5830     1  0 06:38 ?  00:00:00 /sbin/opensips -m 256 -P /var/run/user/opensips.pid
>>>>> user      5832  5830  0 06:38 ?  00:00:16 /sbin/opensips -m 256 -P /var/run/user/opensips.pid
>>>>> user      5833  5830  0 06:38 ?  00:00:16 /sbin/opensips -m 256 -P /var/run/user/opensips.pid
>>>>> user      5834  5830  0 06:38 ?  00:00:16 /sbin/opensips -m 256 -P /var/run/user/opensips.pid
>>>>> user      5835  5830  0 06:38 ?  00:00:16 /sbin/opensips -m 256 -P /var/run/user/opensips.pid
>>>>> user      5836  5830  0 06:38 ?  00:00:16 /sbin/opensips -m 256 -P /var/run/user/opensips.pid
>>>>> user      5837  5830  0 06:38 ?  00:00:16 /sbin/opensips -m 256 -P /var/run/user/opensips.pid
>>>>> user      5838  5830  0 06:38 ?  00:00:16 /sbin/opensips -m 256 -P /var/run/user/opensips.pid
>>>>> user      5839  5830  0 06:38 ?  00:00:16 /sbin/opensips -m 256 -P /var/run/user/opensips.pid
>>>>> user      5840  5830  0 06:38 ?  00:00:01 /sbin/opensips -m 256 -P /var/run/user/opensips.pid
>>>>> user      5841  5830  0 06:38 ?  00:00:00 /sbin/opensips -m 256 -P /var/run/user/opensips.pid
>>>>> user      5842  5830  0 06:38 ?  00:00:00 /sbin/opensips -m 256 -P /var/run/user/opensips.pid
>>>>> user      5843  5830  0 06:38 ?  00:00:00 /sbin/opensips -m 256 -P /var/run/user/opensips.pid
>>>>> user      5844  5830  0 06:38 ?  00:00:00 /sbin/opensips -m 256 -P /var/run/user/opensips.pid
>>>>> user      5845  5830  0 06:38 ?  00:00:00 /sbin/opensips -m 256 -P /var/run/user/opensips.pid
>>>>> user      5846  5830  0 06:38 ?  00:00:00 /sbin/opensips -m 256 -P /var/run/user/opensips.pid
>>>>> user      5847  5830  0 06:38 ?  00:00:00 /sbin/opensips -m 256 -P /var/run/user/opensips.pid
>>>>> user      5848  5830  0 06:38 ?  00:00:00 /sbin/opensips -m 256 -P /var/run/user/opensips.pid
>>>>> user      5849  5830  0 06:38 ?  00:00:00 /sbin/opensips -m 256 -P /var/run/user/opensips.pid
>>>>> user      5850  5830  0 06:38 ?  00:00:00 /sbin/opensips -m 256 -P /var/run/user/opensips.pid
>>>>> user      5851  5830  0 06:38 ?  00:00:00 /sbin/opensips -m 256 -P /var/run/user/opensips.pid
>>>>> user      7260  5833  0 08:30 ?  00:00:00 [opensips] <defunct>
>>>>> user      7261  5833  0 08:30 ?  00:00:00 [opensips] <defunct>
>>>>> user      7262  5833  0 08:30 ?  00:00:00 [opensips] <defunct>
>>>>> user      7263  5833  0 08:30 ?  00:00:00 [opensips] <defunct>
>>>>> user      7264  5833  0 08:30 ?  00:00:00 [opensips] <defunct>
>>>>> user      7265  5833  0 08:30 ?  00:00:00 [opensips] <defunct>
>>>>> user      9770  5835  0 08:37 ?  00:00:00 [opensips] <defunct>
>>>>> user      9771  5835  0 08:37 ?  00:00:00 [opensips] <defunct>
>>>>> user      9772  5835  0 08:37 ?  00:00:00 [opensips] <defunct>
>>>>> user      9838  5834  0 08:38 ?  00:00:00 [opensips] <defunct>
>>>>> user      9839  5834  0 08:38 ?  00:00:00 [opensips] <defunct>
>>>>> user     15519  5833  0 08:57 ?  00:00:00 [opensips] <defunct>
>>>>> user     15520  5833  0 08:57 ?  00:00:00 [opensips] <defunct>
>>>>> user     15521  5833  0 08:57 ?  00:00:00 [opensips] <defunct>
>>>>> user     15522  5833  0 08:57 ?  00:00:00 [opensips] <defunct>
>>>>>
>>>>> [r...@hostname ~]# opensipsctl fifo ps
>>>>> Process:: ID=0 PID=5830 Type=attendant
>>>>> Process:: ID=1 PID=5832 Type=SIP receiver udp:xxx.xxx.xxx.xxx:5060
>>>>> Process:: ID=2 PID=5833 Type=SIP receiver udp:xxx.xxx.xxx.xxx:5060
>>>>> Process:: ID=3 PID=5834 Type=SIP receiver udp:xxx.xxx.xxx.xxx:5060
>>>>> Process:: ID=4 PID=5835 Type=SIP receiver udp:xxx.xxx.xxx.xxx:5060
>>>>> Process:: ID=5 PID=5836 Type=SIP receiver udp:xxx.xxx.xxx.xxx:5060
>>>>> Process:: ID=6 PID=5837 Type=SIP receiver udp:xxx.xxx.xxx.xxx:5060
>>>>> Process:: ID=7 PID=5838 Type=SIP receiver udp:xxx.xxx.xxx.xxx:5060
>>>>> Process:: ID=8 PID=5839 Type=SIP receiver udp:xxx.xxx.xxx.xxx:5060
>>>>> Process:: ID=9 PID=5840 Type=timer
>>>>> Process:: ID=10 PID=5841 Type=timer
>>>>> Process:: ID=11 PID=5842 Type=MI FIFO
>>>>> Process:: ID=12 PID=5843 Type=TCP receiver
>>>>> Process:: ID=13 PID=5844 Type=TCP receiver
>>>>> Process:: ID=14 PID=5845 Type=TCP receiver
>>>>> Process:: ID=15 PID=5846 Type=TCP receiver
>>>>> Process:: ID=16 PID=5847 Type=TCP receiver
>>>>> Process:: ID=17 PID=5848 Type=TCP receiver
>>>>> Process:: ID=18 PID=5849 Type=TCP receiver
>>>>> Process:: ID=19 PID=5850 Type=TCP receiver
>>>>> Process:: ID=20 PID=5851 Type=TCP main
>>>>>
>>>>> On Mon, Apr 26, 2010 at 12:22 PM, Bogdan-Andrei Iancu
>>>>> <bog...@voice-system.ro> wrote:
>>>>>
>>>>>> Hi David,
>>>>>>
>>>>>> Let's try and see what's the parent process of the zombie procs -> check
>>>>>> with ps and correlate (for the name) with "opensipsctl fifo ps"
>>>>>>
>>>>>> I guess the parent of the zombies should be the "attendant proc". BTW,
>>>>>> are you running with the "respawn" patch?
>>>>>>
>>>>>> Regards,
>>>>>> Bogdan
>>>>>>
>>>>>> David Cunningham wrote:
>>>>>>
>>>>>>> Hello,
>>>>>>>
>>>>>>> Thanks again for your assistance!
>>>>>>>
>>>>>>> We're not using the mi_xmlrpc module.
>>>>>>>
>>>>>>> Were you suggesting using gdb on the zombie process? I tried and got
>>>>>>> the following:
>>>>>>>
>>>>>>> user 31183 12140 0 10:31 ? 00:00:00 [opensips] <defunct>
>>>>>>> [r...@sip01 ~]# gdb /sbin/opensips 31183
>>>>>>> GNU gdb Fedora (6.8-37.el5)
>>>>>>> Copyright (C) 2008 Free Software Foundation, Inc.
>>>>>>> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
>>>>>>> This is free software: you are free to change and redistribute it.
>>>>>>> There is NO WARRANTY, to the extent permitted by law. Type "show copying"
>>>>>>> and "show warranty" for details.
>>>>>>> This GDB was configured as "x86_64-redhat-linux-gnu"...
>>>>>>> Attaching to program: /sbin/opensips, process 31183
>>>>>>> ptrace: Operation not permitted.
>>>>>>> /root/31183: No such file or directory.
>>>>>>>
>>>>>>> We haven't tested 1.6 in production but might be willing to go that
>>>>>>> road if you're confident it will solve our problems.
>>>>>>>
>>>>>>> Much appreciate your help.
>>>>>>>
>>>>>>> On Thu, Apr 22, 2010 at 6:07 PM, Bogdan-Andrei Iancu
>>>>>>> <bog...@voice-system.ro> wrote:
>>>>>>>
>>>>>>>> Hi David,
>>>>>>>>
>>>>>>>> David Cunningham wrote:
>>>>>>>>
>>>>>>>>> Hello,
>>>>>>>>>
>>>>>>>>> Thank you for the reply!
>>>>>>>>>
>>>>>>>>> The log doesn't say anything useful, just "Listening on" and then the
>>>>>>>>> UDP and TCP IP address and port, and "Aliases" also with UDP and TCP
>>>>>>>>> addresses and ports. I did set "debug = 9" in
>>>>>>>>> /etc/opensips/opensips.cfg but this caused all phones registered with
>>>>>>>>> OpenSIPS to give "NO SERVICE" and we disabled debugging immediately.
>>>>>>>>> It's a busy system.
>>>>>>>>>
>>>>>>>> Do not do that again - full debug slows down your system!
>>>>>>>>
>>>>>>>>> The defunct processes don't just happen when shutting OpenSIPS down
>>>>>>>>> either - they are building up while OpenSIPS is running, at a rate of
>>>>>>>>> about one every few minutes.
>>>>>>>>>
>>>>>>>> By chance, are you using the mi_xmlrpc module?
>>>>>>>>
>>>>>>>>> I'm not sure how to get the information I need with gdb. I attached to
>>>>>>>>> the attendant process (7811) and ran 'bt' which gave the following:
>>>>>>>>>
>>>>>>>>> (gdb) bt
>>>>>>>>> #0 0x00000030e9298570 in __pause_nocancel () from /lib64/libc.so.6
>>>>>>>>> #1 0x0000000000426f1a in main (argc=5, argv=0x7fff771e8c08) at main.c:867
>>>>>>>>>
>>>>>>>> That indicates the attendant process. Is this one in zombie state
>>>>>>>> too? The important thing is to check the zombie procs.
>>>>>>>>
>>>>>>>>> Can anyone point me where to go from here - maybe advice on what gdb
>>>>>>>>> commands would help?
>>>>>>>>>
>>>>>>>>> I should have mentioned that this is on version 1.4.5-notls. We
>>>>>>>>> originally saw it on 1.4.3-notls and upgraded to try and fix it.
>>>>>>>>>
>>>>>>>> I would strongly recommend updating to 1.6.
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Bogdan
>>>>>>>>
>>>>>>>>> Thanks in advance!
>>>>>>>>>
>>>>>>>>> On Wed, Apr 21, 2010 at 8:55 AM, Bogdan-Andrei Iancu
>>>>>>>>> <bog...@voice-system.ro> wrote:
>>>>>>>>>
>>>>>>>>>> Hi David,
>>>>>>>>>>
>>>>>>>>>> The defunct procs seem to be the children of a still running
>>>>>>>>>> opensips proc - this may be the attendant process which, for
>>>>>>>>>> whatever reason, is not stopping (after killing the children procs).
>>>>>>>>>>
>>>>>>>>>> Check what this process is doing (see top, try attaching with gdb).
>>>>>>>>>>
>>>>>>>>>> Also, does the log say something? Errors? Shutdown triggered?
>>>>>>>>>>
>>>>>>>>>> Regards,
>>>>>>>>>> Bogdan
>>>>>>>>>>
>>>>>>>>>> David Cunningham wrote:
>>>>>>>>>>
>>>>>>>>>>> Hello,
>>>>>>>>>>>
>>>>>>>>>>> We have a server which is creating a lot of defunct OpenSIPS
>>>>>>>>>>> processes. An example process tree is below (from ps -ef --forest).
>>>>>>>>>>>
>>>>>>>>>>> I have no idea where to start looking for the cause of this. Any
>>>>>>>>>>> suggestions very welcome!
>>>>>>>>>>>
>>
>> --
>> Bogdan-Andrei Iancu
>> www.voice-system.ro
>>
>> _______________________________________________
>> Users mailing list
>> Users@lists.opensips.org
>> http://lists.opensips.org/cgi-bin/mailman/listinfo/users
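
For anyone landing on this thread later: the failure mode identified above (a worker process forks a child, the child exits, and the parent never waits for it) can be reproduced in a few lines. The Perl code from David's system is not shown in the thread, so this is only a minimal sketch in Python of the same POSIX behaviour. The helper names (process_state, fork_short_lived_child, wait_for_state, demo) are hypothetical, and reading /proc assumes Linux.

```python
import os
import time


def process_state(pid):
    """Return the one-letter process state from /proc/<pid>/stat (Linux only)."""
    with open(f"/proc/{pid}/stat") as f:
        data = f.read()
    # The state field is the first token after the ")" closing the command name.
    return data.rsplit(")", 1)[1].split()[0]


def fork_short_lived_child():
    """Fork a child that exits immediately; the parent gets the child's PID."""
    pid = os.fork()
    if pid == 0:
        os._exit(0)  # child: exit at once, skipping the parent's cleanup handlers
    return pid


def wait_for_state(pid, wanted, timeout=5.0):
    """Poll until the process reaches state `wanted` or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if process_state(pid) == wanted:
            return True
        time.sleep(0.01)
    return False


def demo():
    child = fork_short_lived_child()
    # Until the parent reaps it, the exited child stays in state "Z",
    # which "ps -ef" prints as <defunct>.
    was_zombie = wait_for_state(child, "Z")
    # Reaping: waitpid() collects the exit status and frees the process entry.
    reaped_pid, _status = os.waitpid(child, 0)
    return was_zombie, reaped_pid == child


if __name__ == "__main__":
    print(demo())
```

The missing step in a script that leaves <defunct> entries behind is the waitpid() call. Perl has a waitpid function with the same semantics, so a script that forks from inside a worker would need to call it (or otherwise reap its children) for every child it creates.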