Re: [SR-Users] Kamailio stops responding after 10 days or so
Hello,

thanks for the feedback. Going to work on it asap.

Cheers,
Daniel

On 4/19/11 10:17 PM, Morten Isaksen wrote:
> Hi,
>
> It happened again. I got the result from gdb. Here is the output from a bt inside gdb for each worker and one bt full for one of the workers. They are all stuck in futexlock.h.
>
> My C debugging skills are a bit rusty, so I hope one of you has some ideas.
>
> /Morten
>
> (gdb) bt
> #0 0x00329b6d0a19 in syscall () from /lib64/libc.so.6
> #1 0x2acaf17b2e0d in futex_get (t=, type=, ps=0x7fffafb7d7a0) at ../../mem/../futexlock.h:113
> #2 publ_cback_func (t=, type=, ps=0x7fffafb7d7a0) at send_publish.c:272
> #3 0x2acaeeb97815 in run_trans_callbacks_internal (cb_lst=0x2acaf3cfd260, type=256, trans=0x2acaf3cfd1f0, params=0x7fffafb7d7a0) at t_hooks.c:290
> #4 0x2acaeeb97a6e in run_trans_callbacks (type=0, trans=0x2, req=0x0, rpl=0x246, code=0) at t_hooks.c:317
> #5 0x2acaeebbfcb3 in local_reply (t=0x2acaf3cfd1f0, p_msg=, branch=0, msg_status=, cancel_bitmap=) at t_reply.c:1847
> #6 0x2acaeebc29e8 in reply_received (p_msg=0x8d36c8) at t_reply.c:2133
> #7 0x0044780e in forward_reply (msg=0x8d36c8) at forward.c:689
> #8 0x0047ff02 in receive_msg (buf=0x874a40 "SIP/2.0 200 OK\r\nVia: SIP/2.0/UDP 178.21.248.8;branch=z9hG4bKcd3a.d99d6403.0\r\nTo: sip:+380432571...@sip.uni-tel.dk;user=phone;tag=4e546d1f8c10c6f99af9c51d895c9c87-c9ca\r\nFrom: sip:+380432571...@sip.uni-"..., len=, rcv_info=0x7fffafb7dd20) at receive.c:257
> #9 0x005067ab in udp_rcv_loop () at udp_server.c:520
> #10 0x00455cdf in main_loop () at main.c:1447
> #11 0x00456de2 in main (argc=, argv=0x7fffafb7dfe8) at main.c:2251
>
> (gdb) bt
> #0 0x00329b6d0a19 in syscall () from /lib64/libc.so.6
> #1 0x2acaeeb7ea32 in futex_get (i=) at ../../mem/../futexlock.h:100
> #2 _lock (i=) at lock.h:98
> #3 lock_hash (i=) at h_table.c:98
> #4 0x2acaeeba1b99 in t_lookup_request (p_msg=0x8d36c8, leave_new_locked=0, cancel=0x7fffafb7d5e0) at t_lookup.c:548
> #5 0x2acaeeba3532 in t_check_msg (p_msg=0x8d36c8, param_branch=) at t_lookup.c:1104
> #6 0x2acaeebab7b1 in t_check_trans (msg=0x2acaf1f4b2e8, foo=, bar=0x2) at tm.c:1881
> #7 0x0041345c in do_action (h=0x7fffafb7dab0, a=0x8a6f98, msg=0x8d36c8) at action.c:860
> #8 0x00415d2b in run_actions (h=0x7fffafb7dab0, a=0x89f8d8, msg=0x8d36c8) at action.c:1293
> #9 0x00416084 in run_top_route (a=0x89f8d8, msg=0x8d36c8, c=) at action.c:1341
> #10 0x0047ff4c in receive_msg (buf=0x874a40 "INVITE sip:20322595@178.21.248.8:5060;transport=udp SIP/2.0\r\nRecord-Route: \r\nVia: SIP/2.0/UDP 178.21.248.20;branch=z9hG4bKdd3a.3d0516e1.0\r\nVia: SIP/2.0/UDP 81.27."..., len=, rcv_info=0x7fffafb7dd20) at receive.c:196
> #11 0x005067ab in udp_rcv_loop () at udp_server.c:520
> #12 0x00455cdf in main_loop () at main.c:1447
> #13 0x00456de2 in main (argc=, argv=0x7fffafb7dfe8) at main.c:2251
>
> #0 0x00329b6d0a19 in syscall () from /lib64/libc.so.6
> #1 0x2acaeeb7ea32 in futex_get (i=) at ../../mem/../futexlock.h:100
> #2 _lock (i=) at lock.h:98
> #3 lock_hash (i=) at h_table.c:98
> #4 0x2acaeeba1b99 in t_lookup_request (p_msg=0x8d36c8, leave_new_locked=0, cancel=0x7fffafb7d5e0) at t_lookup.c:548
> #5 0x2acaeeba3532 in t_check_msg (p_msg=0x8d36c8, param_branch=) at t_lookup.c:1104
> #6 0x2acaeebab7b1 in t_check_trans (msg=0x2acaf1f4b2e8, foo=, bar=0x2) at tm.c:1881
> #7 0x0041345c in do_action (h=0x7fffafb7dab0, a=0x8a6f98, msg=0x8d36c8) at action.c:860
> #8 0x00415d2b in run_actions (h=0x7fffafb7dab0, a=0x89f8d8, msg=0x8d36c8) at action.c:1293
> #9 0x00416084 in run_top_route (a=0x89f8d8, msg=0x8d36c8, c=) at action.c:1341
> #10 0x0047ff4c in receive_msg (buf=0x874a40 "INVITE sip:20322595@178.21.248.8:5060;transport=udp SIP/2.0\r\nRecord-Route: \r\nVia: SIP/2.0/UDP 178.21.248.20;branch=z9hG4bKdd3a.3d0516e1.0\r\nVia: SIP/2.0/UDP 81.27."..., len=, rcv_info=0x7fffafb7dd20) at receive.c:196
> #11 0x005067ab in udp_rcv_loop () at udp_server.c:520
> #12 0x00455cdf in main_loop () at main.c:1447
> #13 0x00456de2 in main (argc=, argv=0x7fffafb7dfe8) at main.c:2251
>
> (gdb) bt
> #0 0x00329b6d0a19 in syscall () from /lib64/libc.so.6
> #1 0x2acaeeb7e9fa in futex_get (i=) at ../../mem/../futexlock.h:113
> #2 _lock (i=) at lock.h:98
> #3 lock_hash (i=) at h_table.c:98
> #4 0x2acaeebca0e3 in t_uac_prepare (uac_r=0x7fffafb7c270, dst_req=0x7fffafb7c060, dst_cell=0x7fffafb7c058) at uac.c:319
> #5 0x2acaeebcb4ae in t_uac_with_ids (uac_r=0x2acaf1f4b2e8, ret_index=0x0, ret_label=0x0) at uac.c:531
> #6 0x2acaeebcc9c7 in request (uac_r=0x7fffafb7c270, ruri=0x11, to=0x1b, from=0x8e0fe0, next_hop=0x2acaf19c01c0) at uac.c:778
> #7 0x2acaf17af3ff in send_publish (publ=) at send_publish.c:556
> #8 0x2acaf19c4f67 in dialog_publish (state=, entity=, peer=, callid=, initiator=, lifetime=43200, localtag=0x0, remotetag=0x0, localtarget=0x
Re: [SR-Users] Kamailio stops responding after 10 days or so
Hi,

It happened again. I got the result from gdb. Here is the output from a bt inside gdb for each worker and one bt full for one of the workers. They are all stuck in futexlock.h.

My C debugging skills are a bit rusty, so I hope one of you has some ideas.

/Morten

(gdb) bt
#0 0x00329b6d0a19 in syscall () from /lib64/libc.so.6
#1 0x2acaf17b2e0d in futex_get (t=, type=, ps=0x7fffafb7d7a0) at ../../mem/../futexlock.h:113
#2 publ_cback_func (t=, type=, ps=0x7fffafb7d7a0) at send_publish.c:272
#3 0x2acaeeb97815 in run_trans_callbacks_internal (cb_lst=0x2acaf3cfd260, type=256, trans=0x2acaf3cfd1f0, params=0x7fffafb7d7a0) at t_hooks.c:290
#4 0x2acaeeb97a6e in run_trans_callbacks (type=0, trans=0x2, req=0x0, rpl=0x246, code=0) at t_hooks.c:317
#5 0x2acaeebbfcb3 in local_reply (t=0x2acaf3cfd1f0, p_msg=, branch=0, msg_status=, cancel_bitmap=) at t_reply.c:1847
#6 0x2acaeebc29e8 in reply_received (p_msg=0x8d36c8) at t_reply.c:2133
#7 0x0044780e in forward_reply (msg=0x8d36c8) at forward.c:689
#8 0x0047ff02 in receive_msg (buf=0x874a40 "SIP/2.0 200 OK\r\nVia: SIP/2.0/UDP 178.21.248.8;branch=z9hG4bKcd3a.d99d6403.0\r\nTo: sip:+380432571...@sip.uni-tel.dk;user=phone;tag=4e546d1f8c10c6f99af9c51d895c9c87-c9ca\r\nFrom: sip:+380432571...@sip.uni-"..., len=, rcv_info=0x7fffafb7dd20) at receive.c:257
#9 0x005067ab in udp_rcv_loop () at udp_server.c:520
#10 0x00455cdf in main_loop () at main.c:1447
#11 0x00456de2 in main (argc=, argv=0x7fffafb7dfe8) at main.c:2251

(gdb) bt
#0 0x00329b6d0a19 in syscall () from /lib64/libc.so.6
#1 0x2acaeeb7ea32 in futex_get (i=) at ../../mem/../futexlock.h:100
#2 _lock (i=) at lock.h:98
#3 lock_hash (i=) at h_table.c:98
#4 0x2acaeeba1b99 in t_lookup_request (p_msg=0x8d36c8, leave_new_locked=0, cancel=0x7fffafb7d5e0) at t_lookup.c:548
#5 0x2acaeeba3532 in t_check_msg (p_msg=0x8d36c8, param_branch=) at t_lookup.c:1104
#6 0x2acaeebab7b1 in t_check_trans (msg=0x2acaf1f4b2e8, foo=, bar=0x2) at tm.c:1881
#7 0x0041345c in do_action (h=0x7fffafb7dab0, a=0x8a6f98, msg=0x8d36c8) at action.c:860
#8 0x00415d2b in run_actions (h=0x7fffafb7dab0, a=0x89f8d8, msg=0x8d36c8) at action.c:1293
#9 0x00416084 in run_top_route (a=0x89f8d8, msg=0x8d36c8, c=) at action.c:1341
#10 0x0047ff4c in receive_msg (buf=0x874a40 "INVITE sip:20322595@178.21.248.8:5060;transport=udp SIP/2.0\r\nRecord-Route: \r\nVia: SIP/2.0/UDP 178.21.248.20;branch=z9hG4bKdd3a.3d0516e1.0\r\nVia: SIP/2.0/UDP 81.27."..., len=, rcv_info=0x7fffafb7dd20) at receive.c:196
#11 0x005067ab in udp_rcv_loop () at udp_server.c:520
#12 0x00455cdf in main_loop () at main.c:1447
#13 0x00456de2 in main (argc=, argv=0x7fffafb7dfe8) at main.c:2251

#0 0x00329b6d0a19 in syscall () from /lib64/libc.so.6
#1 0x2acaeeb7ea32 in futex_get (i=) at ../../mem/../futexlock.h:100
#2 _lock (i=) at lock.h:98
#3 lock_hash (i=) at h_table.c:98
#4 0x2acaeeba1b99 in t_lookup_request (p_msg=0x8d36c8, leave_new_locked=0, cancel=0x7fffafb7d5e0) at t_lookup.c:548
#5 0x2acaeeba3532 in t_check_msg (p_msg=0x8d36c8, param_branch=) at t_lookup.c:1104
#6 0x2acaeebab7b1 in t_check_trans (msg=0x2acaf1f4b2e8, foo=, bar=0x2) at tm.c:1881
#7 0x0041345c in do_action (h=0x7fffafb7dab0, a=0x8a6f98, msg=0x8d36c8) at action.c:860
#8 0x00415d2b in run_actions (h=0x7fffafb7dab0, a=0x89f8d8, msg=0x8d36c8) at action.c:1293
#9 0x00416084 in run_top_route (a=0x89f8d8, msg=0x8d36c8, c=) at action.c:1341
#10 0x0047ff4c in receive_msg (buf=0x874a40 "INVITE sip:20322595@178.21.248.8:5060;transport=udp SIP/2.0\r\nRecord-Route: \r\nVia: SIP/2.0/UDP 178.21.248.20;branch=z9hG4bKdd3a.3d0516e1.0\r\nVia: SIP/2.0/UDP 81.27."..., len=, rcv_info=0x7fffafb7dd20) at receive.c:196
#11 0x005067ab in udp_rcv_loop () at udp_server.c:520
#12 0x00455cdf in main_loop () at main.c:1447
#13 0x00456de2 in main (argc=, argv=0x7fffafb7dfe8) at main.c:2251

(gdb) bt
#0 0x00329b6d0a19 in syscall () from /lib64/libc.so.6
#1 0x2acaeeb7e9fa in futex_get (i=) at ../../mem/../futexlock.h:113
#2 _lock (i=) at lock.h:98
#3 lock_hash (i=) at h_table.c:98
#4 0x2acaeebca0e3 in t_uac_prepare (uac_r=0x7fffafb7c270, dst_req=0x7fffafb7c060, dst_cell=0x7fffafb7c058) at uac.c:319
#5 0x2acaeebcb4ae in t_uac_with_ids (uac_r=0x2acaf1f4b2e8, ret_index=0x0, ret_label=0x0) at uac.c:531
#6 0x2acaeebcc9c7 in request (uac_r=0x7fffafb7c270, ruri=0x11, to=0x1b, from=0x8e0fe0, next_hop=0x2acaf19c01c0) at uac.c:778
#7 0x2acaf17af3ff in send_publish (publ=) at send_publish.c:556
#8 0x2acaf19c4f67 in dialog_publish (state=, entity=, peer=, callid=, initiator=, lifetime=43200, localtag=0x0, remotetag=0x0, localtarget=0x0, remotetarget=0x0) at dialog_publish.c:349
#9 0x2acaf19c6880 in __dialog_created (dlg=0x2acaf575eb38, type=, _para
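Since every worker in the traces above is blocked in futexlock.h, it can help to see which caller each worker was in when it asked for the lock. What follows is only a minimal Python sketch, assuming the gdb output has been saved as text. The helper names (parse_backtraces, lock_caller, summarize) are hypothetical and not part of Kamailio or gdb, and the set of frames treated as lock wrappers is simply taken from the frames that appear in this thread.

```python
import re
from collections import Counter

# One gdb frame header: "#N [0xADDR in ]function (".
FRAME_RE = re.compile(r"#(\d+)\s+(?:0x[0-9a-fA-F]+ in )?(\w+) \(")
# Source location at the end of a frame: ") at file:line".
LOCATION_RE = re.compile(r"\) at (\S+):(\d+)")

# Assumption: these frames are only lock plumbing, as seen in the traces above.
LOCK_FRAMES = {"syscall", "futex_get", "_lock", "lock_hash"}


def parse_backtraces(text):
    """Split saved gdb output into backtraces; each frame #0 starts a new one."""
    matches = list(FRAME_RE.finditer(text))
    traces = []
    for idx, match in enumerate(matches):
        end = matches[idx + 1].start() if idx + 1 < len(matches) else len(text)
        location = LOCATION_RE.search(text[match.start():end])
        frame = {
            "func": match.group(2),
            "file": location.group(1) if location else None,
            "line": int(location.group(2)) if location else None,
        }
        if match.group(1) == "0" or not traces:
            traces.append([])
        traces[-1].append(frame)
    return traces


def lock_caller(trace):
    """Return the first function in the trace that is not lock plumbing."""
    for frame in trace:
        if frame["func"] not in LOCK_FRAMES:
            return frame["func"]
    return None


def summarize(text):
    """Count how many backtraces were waiting for a lock on behalf of each caller."""
    return Counter(lock_caller(trace) for trace in parse_backtraces(text))


# Shortened excerpt of two of the backtraces posted in this thread.
SAMPLE = """
(gdb) bt
#0 0x00329b6d0a19 in syscall () from /lib64/libc.so.6
#1 0x2acaf17b2e0d in futex_get (t=, type=, ps=0x7fffafb7d7a0) at ../../mem/../futexlock.h:113
#2 publ_cback_func (t=, type=, ps=0x7fffafb7d7a0) at send_publish.c:272
(gdb) bt
#0 0x00329b6d0a19 in syscall () from /lib64/libc.so.6
#1 0x2acaeeb7ea32 in futex_get (i=) at ../../mem/../futexlock.h:100
#2 _lock (i=) at lock.h:98
#3 lock_hash (i=) at h_table.c:98
#4 0x2acaeeba1b99 in t_lookup_request (p_msg=0x8d36c8, leave_new_locked=0, cancel=0x7fffafb7d5e0) at t_lookup.c:548
"""

if __name__ == "__main__":
    for caller, count in summarize(SAMPLE).items():
        print(caller, count)
```

The summary only says where each worker is waiting. It does not say which process holds the lock, so it is a starting point for reading the traces rather than a diagnosis.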
Re: [SR-Users] Kamailio stops responding after 10 days or so
On 04/09/2011 11:09 AM, Morten Isaksen wrote:
> Hi Marius,

Hello,

Thanks for the update, but unfortunately I can't spot anything wrong. As Daniel noted, please send a trace of the worker process (you can also select the worker's pid from the log file). Attach with GDB several times and issue a bt (or bt full) command. If the traces are different, this will help. Also run ngrep to confirm that you do have traffic and that K is not responding at all.

Marius

> The carrier avp is always set in route[0]. My failure_route looks like this.
>
> xlog("L_WARN", "Failure route - M=$rm RURI=$ru F=$fu T=$tu IP=$si ID=$ci\n");
> if (t_check_status("408|404|5[0-9][0-9]|6[0-9][0-9]") && !t_check_status("503")) {
>     revert_uri();
>     if (!cr_next_domain("$avp(s:carrier)", "$avp(s:domain)", "$rU", "$avp(s:host)", "$T_reply_code", "$avp(s:domain)")) {
>         xlog("L_ERR", "cr_next_domain failed\n");
>         exit;
>     }
>     if (!cr_route("$avp(s:carrier)", "$avp(s:domain)", "$rU", "$rU", "call_id")) {
>         xlog("L_ERR", "cr_route failed\n");
>         exit;
>     }
>     $avp(s:host) = $rd;
>     t_on_failure("COREROUTE");
>     append_branch();
>     xlog("L_WARN", "Outgoing M=$rm RURI=$ru F=$fu T=$tu IP=$si ID=$ci\n");
>     xlog("$si -> $rd");
>     if (!t_relay()) {
>         xlog("L_ERR", "t_relay failed\n");
>         exit;
>     };
> }

___
SIP Express Router (SER) and Kamailio (OpenSER) - sr-users mailing list
sr-users@lists.sip-router.org
http://lists.sip-router.org/cgi-bin/mailman/listinfo/sr-users
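The advice to attach several times and compare the traces can be partly automated. The Python sketch below assumes each snapshot is the saved text of one bt run against the same worker. It compares only the function names per frame, so differing addresses or argument values do not count as a change. The names call_chain and snapshots_differ are hypothetical helpers, not gdb or Kamailio functions.

```python
import re

# One gdb frame header: "#N [0xADDR in ]function (".
FRAME_RE = re.compile(r"#\d+\s+(?:0x[0-9a-fA-F]+ in )?(\w+) \(")


def call_chain(trace_text):
    """Return the function names of one backtrace, innermost frame first."""
    return FRAME_RE.findall(trace_text)


def snapshots_differ(snapshots):
    """True if at least two snapshots of the same process show different call chains."""
    chains = {tuple(call_chain(snapshot)) for snapshot in snapshots}
    return len(chains) > 1


if __name__ == "__main__":
    first = (
        "#0 0x00329b6d0a19 in syscall () from /lib64/libc.so.6\n"
        "#1 0x2acaeeb7ea32 in futex_get (i=) at ../../mem/../futexlock.h:100\n"
    )
    print(snapshots_differ([first, first]))
```

If the call chain is identical across snapshots taken some seconds apart, the worker is probably parked at the same point the whole time. If it changes, the process is still making progress between attaches.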
Re: [SR-Users] Kamailio stops responding after 10 days or so
Hi Marius,

The carrier avp is always set in route[0]. My failure_route looks like this.

xlog("L_WARN", "Failure route - M=$rm RURI=$ru F=$fu T=$tu IP=$si ID=$ci\n");
if (t_check_status("408|404|5[0-9][0-9]|6[0-9][0-9]") && !t_check_status("503")) {
    revert_uri();
    if (!cr_next_domain("$avp(s:carrier)", "$avp(s:domain)", "$rU", "$avp(s:host)", "$T_reply_code", "$avp(s:domain)")) {
        xlog("L_ERR", "cr_next_domain failed\n");
        exit;
    }
    if (!cr_route("$avp(s:carrier)", "$avp(s:domain)", "$rU", "$rU", "call_id")) {
        xlog("L_ERR", "cr_route failed\n");
        exit;
    }
    $avp(s:host) = $rd;
    t_on_failure("COREROUTE");
    append_branch();
    xlog("L_WARN", "Outgoing M=$rm RURI=$ru F=$fu T=$tu IP=$si ID=$ci\n");
    xlog("$si -> $rd");
    if (!t_relay()) {
        xlog("L_ERR", "t_relay failed\n");
        exit;
    };
}

On Fri, Apr 8, 2011 at 11:08 AM, marius zbihlei wrote:
> On 04/07/2011 10:02 PM, Morten Isaksen wrote:
>>
>> Hi!
>>
>> Kamailio 3.0.3.
>>
>> I have a strange problem with one of our Kamailio servers. This one is
>> used for routing (with carrierroute) and to send presence information
>> (with pua module)
>>
>> Once every 10 days or so I get this error and then Kamailio stops
>> responding to any SIP packets.
>>
>> Apr 6 08:05:48 sip-core-1 /usr/local/sbin/kamailio[9186]: WARNING:
>>
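To make it easier to see which replies enter the rerouting branch of the failure_route above, here is a small Python model of its status condition. It is only a sketch: it assumes the reply code is a three-digit string and that matching the whole code gives the same result as the check in the config. It does not reproduce how the tm module evaluates t_check_status. The function name should_reroute is hypothetical.

```python
import re

# Same patterns as in the failure_route condition.
REROUTE_STATUS = re.compile(r"408|404|5[0-9][0-9]|6[0-9][0-9]")
EXCLUDED_STATUS = re.compile(r"503")


def should_reroute(reply_code):
    """Model of: t_check_status("408|404|5xx|6xx") && !t_check_status("503")."""
    code = str(reply_code)
    matches = REROUTE_STATUS.fullmatch(code) is not None
    excluded = EXCLUDED_STATUS.fullmatch(code) is not None
    return matches and not excluded


if __name__ == "__main__":
    for code in (404, 408, 486, 500, 503, 603):
        print(code, should_reroute(code))
```

In this model 404, 408, every other 5xx and every 6xx reply try the next carrier domain, while 503 and anything else (for example 486) fall through without rerouting.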
Re: [SR-Users] Kamailio stops responding after 10 days or so
On 04/07/2011 10:02 PM, Morten Isaksen wrote:
> Hi!
>
> Kamailio 3.0.3.
>
> I have a strange problem with one of our Kamailio servers. This one is used for routing (with carrierroute) and to send presence information (with pua module)
>
> Once every 10 days or so I get this error and then Kamailio stops responding to any SIP packets.
>
> Apr 6 08:05:48 sip-core-1 /usr/local/sbin/kamailio[9186]: WARNING:
Re: [SR-Users] Kamailio stops responding after 10 days or so
Hi,

No, normal CPU usage. Thanks for the suggestion - I will try that next time.

/Morten

On Thu, Apr 7, 2011 at 10:52 PM, Daniel-Constantin Mierla wrote:
> Hello,
>
> do you get high CPU usage by kamailio?
>
> What you can do is to attach with gdb to kamailio processes and see what
> they are doing:
>
> gdb /path/to/kamailio pid_of_a_kamailio_process
> bt
>
> You should attach to the sip worker processes - you can find the type of
> processes with 'kamctl ps'.
>
> Cheers,
> Daniel
>
> On 4/7/11 9:02 PM, Morten Isaksen wrote:
>>
>> Hi!
>>
>> Kamailio 3.0.3.
>>
>> I have a strange problem with one of our Kamailio servers. This one is
>> used for routing (with carrierroute) and to send presence information
>> (with pua module)
>>
>> Once every 10 days or so I get this error and then Kamailio stops
>> responding to any SIP packets.
>>
>> Apr 6 08:05:48 sip-core-1 /usr/local/sbin/kamailio[9186]: WARNING:
>>
Re: [SR-Users] Kamailio stops responding after 10 days or so
Hello,

do you get high CPU usage by kamailio?

What you can do is to attach with gdb to kamailio processes and see what they are doing:

gdb /path/to/kamailio pid_of_a_kamailio_process
bt

You should attach to the sip worker processes - you can find the type of processes with 'kamctl ps'.

Cheers,
Daniel

On 4/7/11 9:02 PM, Morten Isaksen wrote:
> Hi!
>
> Kamailio 3.0.3.
>
> I have a strange problem with one of our Kamailio servers. This one is used for routing (with carrierroute) and to send presence information (with pua module)
>
> Once every 10 days or so I get this error and then Kamailio stops responding to any SIP packets.
>
> Apr 6 08:05:48 sip-core-1 /usr/local/sbin/kamailio[9186]: WARNING:
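When there are many workers, the attach-and-bt step can be scripted instead of typed once per process. The Python sketch below uses gdb's non-interactive form (gdb -p PID -batch -ex bt), which is a different invocation from the interactive one shown in the message. The function names are hypothetical, the pids are assumed to have been picked by hand from the 'kamctl ps' output, and attaching normally needs enough privileges to trace the process.

```python
import subprocess


def gdb_backtrace_command(pid, full=False):
    """Build a non-interactive gdb invocation that attaches, prints a backtrace and exits."""
    return ["gdb", "-p", str(pid), "-batch", "-ex", "bt full" if full else "bt"]


def dump_backtraces(pids, full=False, run=subprocess.run):
    """Run gdb once per pid and return {pid: captured gdb output}.

    `run` is injectable so the function can be exercised without gdb installed.
    """
    traces = {}
    for pid in pids:
        result = run(gdb_backtrace_command(pid, full), capture_output=True, text=True)
        traces[pid] = result.stdout
    return traces


if __name__ == "__main__":
    # 9186 is the pid from the log line in this thread; replace it with the
    # current pids of the SIP worker processes.
    print(" ".join(gdb_backtrace_command(9186)))
```

Each worker is stopped only while gdb is attached, so on a server that is still handling traffic it is better to keep the list of pids short.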