Hi Dan, The vital information is the backtrace (with gdb) of the FIFO process while blocked = as there is not CPU usage, I say the procs stuck in a IO op. BTW, try to run the test with debug=6 (if possible).
If you need help with the gdb syntax, let me know. Regards, Bogdan DanB wrote: > Hey Bogdan, > > Hereby some more tests (I should mention that I cannot block on demand > the fifo, it simply randomly blocks, so I must be lucky when I am > trying to reproduce it). > > Based on ps the cpu and memory consuption do not increase during the > hang (CPU:0.0 Memory:1.0). > > Ta, > DanB > > Log of the actions: > ------------------------------------------------------------------- > > Before reload: > > sip1:~# opensipsctl ps|grep FIFO > > Process:: ID=13 PID=26368 Type=MI FIFO > > sip1:~# ps uww -p 26368 > > USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND > > opensips 26368 0.0 1.0 1147236 10788 ? S 07:30 0:00 > /usr/sbin/opensips -P /var/run/opensips/opensips.pid -m 1024 -u opensips -g > opensips > > sip1:~# > > > > During hanging reload: > > sip1:~# ps uww -p 26368 > > USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND > > opensips 26368 0.0 1.0 1147644 11040 ? R 07:30 0:00 > /usr/sbin/opensips -P /var/run/opensips/opensips.pid -m 1024 -u opensips -g > opensips > > > After reload interrupted with crtl+c and /etc/init.d/opensips restart > > > sip1:~# ps uww -p 26368 > > USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND > > sip1:~# opensipsctl ps|grep FIFO > > Process:: ID=13 PID=26716 Type=MI FIFO > > sip1:~# ps uww -p 26716 > > USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND > > opensips 26716 0.0 0.1 1147164 1988 ? S 14:59 0:00 > /usr/sbin/opensips -P /var/run/opensips/opensips.pid -m 1024 -u opensips -g > opensips > > On Mon, Apr 26, 2010 at 12:25 PM, Bogdan-Andrei Iancu > <bog...@voice-system.ro> wrote: > >> Hi Dan, >> >> Your descriptions point to a blocked fifo process. Blocking maybe >> because of some internal locking (you see 99% cpu usage) or some I/O >> (normal cpu usage). >> >> So, do the followings: >> >> 1) do "opensipsctl fifo ps" to see the PID of the fifo process >> 2) make fifo to block >> 3) check if the fifo process (by pid) is there - if yes, see how much >> cpu it uses and try to attache with gdb to it to get a backtrace. >> >> Regards, >> Bogdan >> >> DanB wrote: >> >>> Hey Bogdan, >>> >>> Thanks for coming back so fast. >>> >>> There was no error reported neither on console nor in the syslog >>> (debug 7). I will need to check for dead process since all I could >>> spot was no reply back and console hanging, and be able to stop it >>> only with Ctrl+C. After Ctrl+C could not get the any other opensipsctl >>> commands to work. >>> Will need to wait few days more to know about dead process. >>> >>> DanB >>> >>> On Thu, Apr 22, 2010 at 6:55 PM, Bogdan-Andrei Iancu >>> <bog...@voice-system.ro> wrote: >>> >>> >>>> Hi Dan, >>>> >>>> Did you notice any error from the fifo process during the reload ? it >>>> may be something related to locking (during reload) of the table -> this >>>> may affect all the other processes. >>>> >>>> So, any errors? any dead processes (like fifo one) ? >>>> >>>> Regards, >>>> Bogdan >>>> >>>> DanB wrote: >>>> >>>> >>>>> Hey Guys, >>>>> >>>>> I have noticed the "opensipsctl dialplan reload" command randomly >>>>> hanging, sometimes even the server itself becoming non responsible, >>>>> other times reloading the dialplan into memory but not reporting >>>>> anything on console, the last one becoming unusable until server >>>>> restart. This happened in the past as well but with the traffic >>>>> increase, it becomes more and more annoying. I suspect the same bug >>>>> which was present in the past with fifo hanging. >>>>> I should mention that I got about 2000 records in the dialplan table, >>>>> so I would say not that much loaded. >>>>> >>>>> The version I am running: >>>>> sip1:/home/employee/dan# opensips -V >>>>> version: opensips 1.6.1-notls (x86_64/linux) >>>>> flags: STATS: Off, USE_IPV6, USE_TCP, DISABLE_NAGLE, USE_MCAST, >>>>> SHM_MEM, SHM_MMAP, PKG_MALLOC, F_MALLOC, FAST_LOCK-ADAPTIVE_WAIT >>>>> ADAPTIVE_WAIT_LOOPS=1024, MAX_RECV_BUFFER_SIZE 262144, MAX_LISTEN 16, >>>>> MAX_URI_SIZE 1024, BUF_SIZE 65535 >>>>> poll method support: poll, epoll_lt, epoll_et, sigio_rt, select. >>>>> svnrevision: 2:6509M >>>>> @(#) $Id: main.c 6169 2009-09-22 12:48:37Z bogdan_iancu $ >>>>> main.c compiled on 14:43:30 Jan 11 2010 with gcc 4.3.2 >>>>> >>>>> >>>>> All I could find in the logs was database reconnection in the case of >>>>> hang but no error or something else reported (running debug 7). >>>>> Bellow some of the log: >>>>> Apr 22 14:27:19 sip1 /usr/sbin/opensips[16299]: >>>>> DBG:mi_fifo:mi_fifo_server: entered consume >>>>> Apr 22 14:27:19 sip1 /usr/sbin/opensips[16299]: >>>>> DBG:mi_fifo:mi_fifo_server: **** done consume >>>>> Apr 22 14:27:19 sip1 /usr/sbin/opensips[16299]: >>>>> DBG:mi_fifo:mi_fifo_server: done parsing the mi tree >>>>> Apr 22 14:27:19 sip1 /usr/sbin/opensips[16299]: DBG:dialplan:dp_load_db: >>>>> init >>>>> Apr 22 14:27:19 sip1 /usr/sbin/opensips[16299]: >>>>> INFO:db_mysql:db_mysql_submit_query: disconect event for 0x77f060 >>>>> Apr 22 14:27:19 sip1 /usr/sbin/opensips[16299]: >>>>> INFO:db_mysql:reset_all_statements: reseting all statements on >>>>> connection: (0x77fd18) 0x77f060 >>>>> Apr 22 14:27:19 sip1 /usr/sbin/opensips[16299]: >>>>> DBG:db_mysql:db_mysql_connect: opening connection: >>>>> mysql://xxxx:x...@192.168.11.253/sipeandb >>>>> Apr 22 14:27:19 sip1 /usr/sbin/opensips[16299]: >>>>> DBG:db_mysql:db_mysql_connect: connection type is 192.168.11.253 via >>>>> TCP/IP >>>>> Apr 22 14:27:19 sip1 /usr/sbin/opensips[16299]: >>>>> DBG:db_mysql:db_mysql_connect: protocol version is 10 >>>>> Apr 22 14:27:19 sip1 /usr/sbin/opensips[16299]: >>>>> DBG:db_mysql:db_mysql_connect: server version is 5.0.51a-24+lenny1-log >>>>> Apr 22 14:27:19 sip1 /usr/sbin/opensips[16299]: >>>>> INFO:db_mysql:db_mysql_submit_query: re-connected successful for >>>>> 0x77f060 >>>>> Apr 22 14:27:19 sip1 /usr/sbin/opensips[16299]: >>>>> DBG:core:db_new_result: allocate 48 bytes for result set at 0x784258 >>>>> Apr 22 14:27:19 sip1 /usr/sbin/opensips[16299]: >>>>> DBG:db_mysql:db_mysql_get_columns: 8 columns returned from the query >>>>> Apr 22 14:27:19 sip1 /usr/sbin/opensips[16299]: >>>>> DBG:core:db_allocate_columns: allocate 224 bytes for result columns at >>>>> 0x7842a0 >>>>> Apr 22 14:27:19 sip1 /usr/sbin/opensips[16299]: >>>>> DBG:db_mysql:db_mysql_get_columns: RES_NAMES(0x7842e0)[0]=[dpid] >>>>> Apr 22 14:27:19 sip1 /usr/sbin/opensips[16299]: >>>>> DBG:db_mysql:db_mysql_get_columns: use DB_INT result type >>>>> Apr 22 14:27:19 sip1 /usr/sbin/opensips[16299]: >>>>> DBG:db_mysql:db_mysql_get_columns: RES_NAMES(0x7842f0)[1]=[pr] >>>>> Apr 22 14:27:19 sip1 /usr/sbin/opensips[16299]: >>>>> DBG:db_mysql:db_mysql_get_columns: use DB_INT result type >>>>> Apr 22 14:27:19 sip1 /usr/sbin/opensips[16299]: >>>>> DBG:db_mysql:db_mysql_get_columns: RES_NAMES(0x784300)[2]=[match_op] >>>>> Apr 22 14:27:19 sip1 /usr/sbin/opensips[16299]: >>>>> DBG:db_mysql:db_mysql_get_columns: use DB_INT result type >>>>> Apr 22 14:27:19 sip1 /usr/sbin/opensips[16299]: >>>>> DBG:db_mysql:db_mysql_get_columns: RES_NAMES(0x784310)[3]=[match_exp] >>>>> Apr 22 14:27:19 sip1 /usr/sbin/opensips[16299]: >>>>> DBG:db_mysql:db_mysql_get_columns: use DB_STRING result type >>>>> Apr 22 14:27:19 sip1 /usr/sbin/opensips[16299]: >>>>> DBG:db_mysql:db_mysql_get_columns: RES_NAMES(0x784320)[4]=[match_len] >>>>> Apr 22 14:27:19 sip1 /usr/sbin/opensips[16299]: >>>>> DBG:db_mysql:db_mysql_get_columns: use DB_INT result type >>>>> Apr 22 14:27:19 sip1 /usr/sbin/opensips[16299]: >>>>> DBG:db_mysql:db_mysql_get_columns: RES_NAMES(0x784330)[5]=[subst_exp] >>>>> Apr 22 14:27:19 sip1 /usr/sbin/opensips[16299]: >>>>> DBG:db_mysql:db_mysql_get_columns: use DB_STRING result type >>>>> Apr 22 14:27:19 sip1 /usr/sbin/opensips[16299]: >>>>> DBG:db_mysql:db_mysql_get_columns: RES_NAMES(0x784340)[6]=[repl_exp] >>>>> Apr 22 14:27:19 sip1 /usr/sbin/opensips[16299]: >>>>> DBG:db_mysql:db_mysql_get_columns: use DB_STRING result type >>>>> Apr 22 14:27:19 sip1 /usr/sbin/opensips[16299]: >>>>> DBG:db_mysql:db_mysql_get_columns: RES_NAMES(0x784350)[7]=[attrs] >>>>> Apr 22 14:27:19 sip1 /usr/sbin/opensips[16299]: >>>>> DBG:db_mysql:db_mysql_get_columns: use DB_STRING result type >>>>> Apr 22 14:27:19 sip1 /usr/sbin/opensips[16299]: >>>>> DBG:db_mysql:db_mysql_fetch_result: converting row 0 of 2042 count 500 >>>>> Apr 22 14:27:19 sip1 /usr/sbin/opensips[16299]: >>>>> DBG:core:db_allocate_rows: allocate 136000 bytes for result rows and >>>>> values at 0x79c5d8 >>>>> >>>>> Ta, >>>>> DanB >>>>> >>>>> _______________________________________________ >>>>> Users mailing list >>>>> Users@lists.opensips.org >>>>> http://lists.opensips.org/cgi-bin/mailman/listinfo/users >>>>> >>>>> >>>>> >>>>> >>>> -- >>>> Bogdan-Andrei Iancu >>>> www.voice-system.ro >>>> >>>> >>>> _______________________________________________ >>>> Users mailing list >>>> Users@lists.opensips.org >>>> http://lists.opensips.org/cgi-bin/mailman/listinfo/users >>>> >>>> >>>> >>> _______________________________________________ >>> Users mailing list >>> Users@lists.opensips.org >>> http://lists.opensips.org/cgi-bin/mailman/listinfo/users >>> >>> >>> >> -- >> Bogdan-Andrei Iancu >> www.voice-system.ro >> >> >> _______________________________________________ >> Users mailing list >> Users@lists.opensips.org >> http://lists.opensips.org/cgi-bin/mailman/listinfo/users >> >> > > _______________________________________________ > Users mailing list > Users@lists.opensips.org > http://lists.opensips.org/cgi-bin/mailman/listinfo/users > > -- Bogdan-Andrei Iancu www.voice-system.ro _______________________________________________ Users mailing list Users@lists.opensips.org http://lists.opensips.org/cgi-bin/mailman/listinfo/users