Greetings everyone!
It often happens that the service fails completely and refuses to process 
calls. For example, yesterday there was such an incident. I looked at the logs 
and, at the start of the incident, I noticed these strange entries:

kamailio[283008]: WARNING: {1 14714 ACK clMGFB7OWsiT4fQp9TPvV-RlvGqv2zPa} 
<core> [core/tcp_main.c:1297]: find_listening_sock_info(): binding to source 
address 109.68.232.40:15061 failed: Address already in use [98]
kamailio[283008]: WARNING: {1 14714 ACK clMGFB7OWsiT4fQp9TPvV-RlvGqv2zPa} 
<core> [core/tcp_main.c:1441]: tcp_do_connect(): 10.241.60.46:15061: could not 
find corresponding listening socket
kamailio[283008]: ERROR: {1 14714 ACK clMGFB7OWsiT4fQp9TPvV-RlvGqv2zPa} <core> 
[core/tcp_main.c:3078]: tcpconn_1st_send(): connect 10.241.60.46:15061 failed 
(RST) Connection refused
kamailio[283008]: ERROR: {1 14714 ACK clMGFB7OWsiT4fQp9TPvV-RlvGqv2zPa} <core> 
[core/tcp_main.c:3087]: tcpconn_1st_send(): 10.241.60.46:15061: connect & send 
for 0x7f7107e01750 (sock 39)
kamailio[283008]: CRITICAL: {1 14714 ACK clMGFB7OWsiT4fQp9TPvV-RlvGqv2zPa} 
<core> [core/forward.h:290]: msg_send_buffer(): tcp_send failed
kamailio[283008]: ERROR: {1 14714 ACK clMGFB7OWsiT4fQp9TPvV-RlvGqv2zPa} sl 
[sl_funcs.c:415]: sl_reply_error(): stateless error reply used: Unfortunately 
error on sending to next hop occurred (477/SL)
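
In case it helps, here is roughly what I can check when this warning appears, to see whether something else is holding the TCP source port (a sketch; the address and port come from the log above, and I assume the standard iproute2 "ss" tool is available):

# show every TCP socket bound to local port 15061, with the owning process
ss -tanp 'sport = :15061'

# show the ephemeral port range, in case outgoing connections can collide with 15061
sysctl net.ipv4.ip_local_port_range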

After several such entries, this begins:
/usr/local/sbin/kamailio[283069]: ERROR: <core> [core/tcp_main.c:4787]: 
tcpconn_main_timeout(): connect 31.173.85.36:10588 failed (timeout)
/usr/local/sbin/kamailio[283069]: ERROR: <core> [core/tcp_main.c:4787]: 
tcpconn_main_timeout(): connect 95.24.163.132:48075 failed (timeout)
/usr/local/sbin/kamailio[282828]: CRITICAL: <core> [core/tcp_main.c:5526]: 
tcp_timer_check_connections(): message processing timeout on connection id: 
22668011 (state: 0) - closing
/usr/local/sbin/kamailio[282828]: CRITICAL: <core> [core/tcp_main.c:5526]: 
tcp_timer_check_connections(): message processing timeout on connection id: 
22668304 (state: 0) - closing
/usr/local/sbin/kamailio[282828]: CRITICAL: <core> [core/tcp_main.c:5526]: 
tcp_timer_check_connections(): message processing timeout on connection id: 
22668309 (state: 0) - closing
/usr/local/sbin/kamailio[282828]: CRITICAL: <core> [core/tcp_main.c:5526]: 
tcp_timer_check_connections(): message processing timeout on connection id: 
22668316 (state: 0) - closing
/usr/local/sbin/kamailio[282828]: CRITICAL: <core> [core/tcp_main.c:5526]: 
tcp_timer_check_connections(): message processing timeout on connection id: 
22668192 (state: 3) - closing
/usr/local/sbin/kamailio[282828]: CRITICAL: <core> [core/tcp_main.c:5526]: 
tcp_timer_check_connections(): message processing timeout on connection id: 
22668228 (state: 0) - closing
/usr/local/sbin/kamailio[282828]: CRITICAL: <core> [core/tcp_main.c:5526]: 
tcp_timer_check_connections(): message processing timeout on connection id: 
22668236 (state: 0) - closing
/usr/local/sbin/kamailio[282828]: CRITICAL: <core> [core/tcp_main.c:5526]: 
tcp_timer_check_connections(): message processing timeout on connection id: 
22668427 (state: 3) - closing
/usr/local/sbin/kamailio[282828]: CRITICAL: <core> [core/tcp_main.c:5526]: 
tcp_timer_check_connections(): message processing timeout on connection id: 
22668428 (state: 3) - closing
/usr/local/sbin/kamailio[282828]: CRITICAL: <core> [core/tcp_main.c:5526]: 
tcp_timer_check_connections(): message processing timeout on connection id: 
22668429 (state: 3) - closing
/usr/local/sbin/kamailio[282828]: CRITICAL: <core> [core/tcp_main.c:5526]: 
tcp_timer_check_connections(): message processing timeout on connection id: 
22668430 (state: 3) - closing
/usr/local/sbin/kamailio[282828]: CRITICAL: <core> [core/tcp_main.c:5526]: 
tcp_timer_check_connections(): message processing timeout on connection id: 
22668431 (state: 3) - closing
/usr/local/sbin/kamailio[282828]: CRITICAL: <core> [core/tcp_main.c:5526]: 
tcp_timer_check_connections(): message processing timeout on connection id: 
22668432 (state: 3) - closing
/usr/local/sbin/kamailio[282828]: CRITICAL: <core> [core/tcp_main.c:5526]: 
tcp_timer_check_connections(): message processing timeout on connection id: 
22668433 (state: 3) - closing
/usr/local/sbin/kamailio[282828]: CRITICAL: <core> [core/tcp_main.c:5526]: 
tcp_timer_check_connections(): message processing timeout on connection id: 
22668434 (state: 3) - closing
/usr/local/sbin/kamailio[282828]: CRITICAL: <core> [core/tcp_main.c:5526]: 
tcp_timer_check_connections(): message processing timeout on connection id: 
22668435 (state: 3) - closing
/usr/local/sbin/kamailio[282828]: CRITICAL: <core> [core/tcp_main.c:5526]: 
tcp_timer_check_connections(): message processing timeout on connection id: 
22668436 (state: 3) - closing

Entries like these keep coming for a very long time (about 5 minutes), and it 
all begins like this:
/usr/local/sbin/kamailio[283069]: CRITICAL: <core> [core/tcp_main.c:4361]: 
send2child(): tcp child 20, socket 273: queue full, 286 requests queued (total 
handled 2421746)
/usr/local/sbin/kamailio[283069]: CRITICAL: <core> [core/tcp_main.c:4361]: 
send2child(): tcp child 29, socket 291: queue full, 286 requests queued (total 
handled 2419460)
/usr/local/sbin/kamailio[283069]: CRITICAL: <core> [core/tcp_main.c:4361]: 
send2child(): tcp child 4, socket 241: queue full, 287 requests queued (total 
handled 2451838)
/usr/local/sbin/kamailio[283069]: CRITICAL: <core> [core/tcp_main.c:4361]: 
send2child(): tcp child 5, socket 243: queue full, 287 requests queued (total 
handled 2445182)
/usr/local/sbin/kamailio[283069]: CRITICAL: <core> [core/tcp_main.c:4361]: 
send2child(): tcp child 18, socket 269: queue full, 287 requests queued (total 
handled 2422113)
/usr/local/sbin/kamailio[283069]: CRITICAL: <core> [core/tcp_main.c:4361]: 
send2child(): tcp child 20, socket 273: queue full, 287 requests queued (total 
handled 2421747)
/usr/local/sbin/kamailio[283069]: CRITICAL: <core> [core/tcp_main.c:4361]: 
send2child(): tcp child 27, socket 287: queue full, 287 requests queued (total 
handled 2419986)
/usr/local/sbin/kamailio[283069]: CRITICAL: <core> [core/tcp_main.c:4361]: 
send2child(): tcp child 29, socket 291: queue full, 287 requests queued (total 
handled 2419461)
/usr/local/sbin/kamailio[283069]: CRITICAL: <core> [core/tcp_main.c:4361]: 
send2child(): tcp child 30, socket 293: queue full, 287 requests queued (total 
handled 2419979)
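
When the "queue full" messages start, it probably also makes sense to look at Kamailio's own TCP counters over RPC (a sketch; it assumes the ctl module is loaded so kamcmd can connect):

# TCP connection and reader state as Kamailio itself sees it
kamcmd core.tcp_info

# the effective TCP options (timeouts, async mode, limits) the server is running with
kamcmd core.tcp_options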

Entries like the following also clutter the logs and make them hard to read:
1) kamailio[282798]: CRITICAL: {2 12704 INVITE 
4cb232e6-9b5a-4a5b-aa0e-1ed73539039f} tm [../../core/forward.h:290]: 
msg_send_buffer(): tcp_send failed
2) /usr/local/sbin/kamailio[282970]: ERROR: sl [sl.c:285]: send_reply(): failed 
to reply stateful (tm)
3) kamailio[283053]: CRITICAL: <core> [core/tcp_read.c:267]: tcp_read_data(): 
error reading: Connection reset by peer (104) ([217.66.156.172]:11604 ->
     kamailio[283053]: CRITICAL: <core> [core/tcp_read.c:270]: tcp_read_data(): 
-> [109.68.232.40]:15061)
     kamailio[283053]: CRITICAL: <core> [core/tcp_read.c:1523]: tcp_read_req(): 
ERROR: tcp_read_req: error reading - c: 0x7f71215bb610 r: 0x7f71215bb738 (-1)
     kamailio[283007]: CRITICAL: <core> [core/tcp_read.c:267]: tcp_read_data(): 
error reading: Connection timed out (110) ([188.170.204.16]:52148 ->
     kamailio[283007]: CRITICAL: <core> [core/tcp_read.c:270]: tcp_read_data(): 
-> [109.68.232.40]:15061)
     kamailio[283007]: CRITICAL: <core> [core/tcp_read.c:1523]: tcp_read_req(): 
ERROR: tcp_read_req: error reading - c: 0x7f711b638fa0 r: 0x7f711b6390c8 (-1)
     kamailio[283027]: ERROR: {1 9746 REGISTER 
8e6af58d-d4ca-4dfe-a8f6-d3c84685c845} tsilo [ts_append.c:121]: ts_append_to(): 
transaction [23428:1443020265] not found
4) kamailio[282820]: CRITICAL: {1 2 UPDATE 
6da65a87-3224-46a1-961f-8f9ee50348cc} tm [../../core/forward.h:290]: 
msg_send_buffer(): tcp_send failed
     kamailio[282820]: WARNING: {1 2 UPDATE 
6da65a87-3224-46a1-961f-8f9ee50348cc} tm [t_fwd.c:1581]: t_send_branch(): 
sending request on branch 0 failed
     kamailio[282820]: ERROR: {1 2 UPDATE 6da65a87-3224-46a1-961f-8f9ee50348cc} 
sl [sl_funcs.c:415]: sl_reply_error(): stateless error reply used: 
Unfortunately error on sending to next hop occurred (477/SL)

Once the queue is full, the system stops servicing new calls and freezes.
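
Next time, before restarting, I plan to capture where the TCP workers are stuck (a sketch; <pid> is a placeholder for one of the kamailio TCP receiver PIDs, e.g. 282828 from the log above, and it assumes gdb and debug symbols are installed):

# list all kamailio processes and their roles
kamcmd core.psx

# grab a backtrace of one apparently blocked worker without stopping it for long
gdb -p <pid> -batch -ex 'bt'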

kamailio -v
version: kamailio 5.7.2 (x86_64/linux) 95629c
flags: USE_TCP, USE_TLS, USE_SCTP, TLS_HOOKS, USE_RAW_SOCKS, DISABLE_NAGLE, 
USE_MCAST, DNS_IP_HACK, SHM_MMAP, PKG_MALLOC, MEM_JOIN_FREE, Q_MALLOC, 
F_MALLOC, TLSF_MALLOC, DBG_SR_MEMORY, USE_FUTEX, FAST_LOCK-ADAPTIVE_WAIT, 
USE_DNS_CACHE, USE_DNS_FAILOVER, USE_NAPTR, USE_DST_BLOCKLIST, HAVE_RESOLV_RES, 
TLS_PTHREAD_MUTEX_SHARED
ADAPTIVE_WAIT_LOOPS 1024, MAX_RECV_BUFFER_SIZE 262144, MAX_URI_SIZE 1024, 
BUF_SIZE 65535, DEFAULT PKG_SIZE 8MB
poll method support: poll, epoll_lt, epoll_et, sigio_rt, select.
id: 95629c 

OS: Debian 11, 64 cores, 125GB of DDR4 RAM; DB: MariaDB 10.5.15;

udp_children: 30; tcp_children: 34; TLS: YES; async_workers: 64.
Incoming calls are delivered via push notifications (Federico Cabiddu's method:
https://www.voztovoice.org/sites/default/files/KamilioWorld2015%20-Federico…).

NetBridging (for SIP and RTPEngine).
ims_charging for billing (integration with our billing system using the Diameter
protocol).
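
Since MariaDB and the Diameter billing peer sit in the call path, a blocking query or charging request inside a worker could in principle stall TCP processing, so during an incident it is probably worth timing a trivial query (a sketch; the user, password and database name "kamailio" are placeholders for my own):

# time a trivial query against the Kamailio database
time mysql -u kamailio -pSECRET -e 'SELECT 1' kamailio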

As soon as the service starts responding with 503, it does not recover on its 
own; it has to be restarted.
My server is powerful enough for the number of users of my service, so I do 
not think resources are the problem: 64 cores, 125 GB of DDR4 RAM, 2 TB of 
disk (of which 1 TB is SSD for the database), and a 10 Gbit link. This server 
runs only Kamailio; the RTPEngines run on separate servers.

Please help me figure out what is causing this service failure!