Hi everyone, Yesterday we experienced some bad behavior with opensips in a couple of our production servers. What we saw is that all TLS/TCP connections were closing and registrations were failing until we restarted opensips. We can't understand why this happened. CPU metrics seem fine.
We are using opensips 2.1. By looking at the logs we see a lot of errors which some of them standout. Please let me know if any of these logs can say something about the cause/trigger for the event. n 16 17:07:45 server 1 /usr/sbin/opensips[12990]: INFO:core:probe_max_sock_buff: using snd buffer of 32768 kb Jan 16 17:07:45 server 1 /usr/sbin/opensips[12990]: INFO:core:init_sock_keepalive: TCP keepalive enabled on socket 216 Jan 16 17:07:45 server 1 /usr/sbin/opensips[13006]: CRITICAL:core:io_watch_add: #012>>> [TCP_main] BUG trying to overwrite entry 4670 in the hash(4670, 19, 0x7f333fbc07c8,1) with (4670, 19, 0x7f3340282170,1)#012#012It seems you have hit a programming bug.#012Please help us make OpenSIPS better by reporting it at https://github.com/OpenSIPS/opensips/issues#012 Jan 16 17:07:45 server 1 /usr/sbin/opensips[13006]: INFO:core:probe_max_sock_buff: using snd buffer of 32768 kb Jan 16 17:07:45 server 1 /usr/sbin/opensips[13006]: INFO:core:init_sock_keepalive: TCP keepalive enabled on socket 4671 Jan 16 17:07:45 server 1 /usr/sbin/opensips[13006]: INFO:core:send2child: no free tcp receiver, connection passed to the least busy one (190) Jan 16 17:11:39 server 1 /usr/sbin/opensips[12995]: INFO:core:probe_max_sock_buff: using snd buffer of 32768 kb Jan 16 17:11:39 server 1 /usr/sbin/opensips[12995]: INFO:core:init_sock_keepalive: TCP keepalive enabled on socket 94 Jan 16 17:11:39 server 1 /usr/sbin/opensips[12957]: WARNING:core:utimer_ticker: utimer task <tm-utimer> already scheduled for 411776060 ms (now 411776160 ms), it may overlap.. Jan 16 17:11:39 server 1 /usr/sbin/opensips[13006]: INFO:core:probe_max_sock_buff: using snd buffer of 32768 kb Jan 16 17:11:39 server 1 /usr/sbin/opensips[13006]: INFO:core:init_sock_keepalive: TCP keepalive enabled on socket 400 Jan 16 17:11:39 server 1 /usr/sbin/opensips[13006]: INFO:core:send2child: no free tcp receiver, connection passed to the least busy one (135) Jan 16 17:11:39 server 1 /usr/sbin/opensips[12957]: WARNING:core:utimer_ticker: utimer task <tm-utimer> already scheduled for 411776160 ms (now 411776260 ms), it may overlap.. Jan 16 17:11:39 server 1 /usr/sbin/opensips[12957]: WARNING:core:timer_ticker: timer task <tm-timer> already scheduled for 411772300 ms (now 411776260 ms), it may overlap.. Jan 16 17:11:39 server 1 /usr/sbin/opensips[13006]: INFO:core:probe_max_sock_buff: using snd buffer of 32768 kb Jan 16 17:11:39 server 1 /usr/sbin/opensips[13006]: INFO:core:init_sock_keepalive: TCP keepalive enabled on socket 406 Jan 16 17:11:39 server 1 /usr/sbin/opensips[13006]: INFO:core:send2child: no free tcp receiver, connection passed to the least busy one (135) Jan 16 17:11:39 server 1 /usr/sbin/opensips[12957]: WARNING:core:utimer_ticker: utimer task <tm-utimer> already scheduled for 411776260 ms (now 411776360 ms), it may overlap.. Jan 16 17:11:39 server 1 /usr/sbin/opensips[12960]: ERROR:core:tcp_connect_blocking: timeout 999977 ms elapsed from 1000000 s Jan 16 17:11:39 server 1 /usr/sbin/opensips[12960]: ERROR:proto_tls:tls_sync_connect: tcp_blocking_connect failed Jan 16 17:11:39 server 1 /usr/sbin/opensips[12960]: ERROR:proto_tls:proto_tls_send: connect failed Jan 16 17:11:39 server 1 /usr/sbin/opensips[12960]: ERROR:tm:msg_send: send() for proto 3 failed Jan 16 17:11:39 server 1 /usr/sbin/opensips[12960]: INFO:core:sig_usr: signal 13 received Jan 16 17:11:39 server 1 /usr/sbin/opensips[12960]: ERROR:proto_tls:tls_write: TLS connection to <ip>:<port> write failed Jan 16 17:11:39 server 1 /usr/sbin/opensips[12960]: ERROR:proto_tls:tls_write: TLS write error: Jan 16 17:11:39 server 1 /usr/sbin/opensips[12960]: ERROR:proto_tls:tls_blocking_write: TLS failed to send data Jan 16 17:11:39 server 1 /usr/sbin/opensips[12960]: ERROR:proto_tls:proto_tls_send: failed to send Jan 16 17:11:39 server 1 /usr/sbin/opensips[12960]: CRITICAL:core:send_all: send on -1 failed: Bad file descriptor Jan 16 17:11:39 server 1 /usr/sbin/opensips[12960]: ERROR:core:tcpconn_release: send_all failed gg an 16 17:11:46 server 1 /usr/sbin/opensips[12977]: ERROR:core:tcp_connect_blocking: timeout 999197 ms elapsed from 1000000 s Jan 16 17:11:46 server 1 /usr/sbin/opensips[12977]: ERROR:proto_tls:tls_sync_connect: tcp_blocking_connect failed Jan 16 17:11:46 server 1 /usr/sbin/opensips[12977]: ERROR:proto_tls:proto_tls_send: connect failed Jan 16 17:11:46 server 1 /usr/sbin/opensips[12977]: ERROR:tm:msg_send: send() for proto 3 failed Jan 16 17:11:46 server 1 /usr/sbin/opensips[12977]: INFO:core:probe_max_sock_buff: using snd buffer of 32768 kb Jan 16 17:11:46 server 1 /usr/sbin/opensips[12977]: INFO:core:init_sock_keepalive: TCP keepalive enabled on socket 47 Jan 16 17:11:46 server 1 /usr/sbin/opensips[12957]: WARNING:core:utimer_ticker: utimer task <tm-utimer> already scheduled for 411783490 ms (now 411783590 ms), it may overlap.. Jan 16 17:11:46 server 1 /usr/sbin/opensips[13006]: INFO:core:probe_max_sock_buff: using snd buffer of 32768 kb Jan 16 17:11:46 server 1 /usr/sbin/opensips[13006]: INFO:core:init_sock_keepalive: TCP keepalive enabled on socket 1414 Jan 16 17:11:46 server 1 /usr/sbin/opensips[13006]: INFO:core:send2child: no free tcp receiver, connection passed to the least busy one (141) Jan 16 17:11:46 server 1 /usr/sbin/opensips[12977]: ERROR:core:tcp_connect_blocking: poll error: flags 28 - 4 8 16 32 Jan 16 17:11:46 server 1 /usr/sbin/opensips[12977]: ERROR:core:tcp_connect_blocking: failed to retrieve SO_ERROR [server=<ip>:<port>] (111) Connection refused Jan 16 17:11:46 server 1 /usr/sbin/opensips[12977]: ERROR:proto_tls:tls_sync_connect: tcp_blocking_connect failed Jan 16 17:11:46 server 1 /usr/sbin/opensips[12977]: ERROR:proto_tls:proto_tls_send: connect failed Jan 16 17:11:46 server 1 /usr/sbin/opensips[12977]: ERROR:tm:msg_send: send() for proto 3 failed Jan 16 17:11:46 server 1 /usr/sbin/opensips[12977]: INFO:core:probe_max_sock_buff: using snd buffer of 32768 kb Jan 16 17:11:46 server 1 /usr/sbin/opensips[12977]: INFO:core:init_sock_keepalive: TCP keepalive enabled on socket 47 Jan 16 17:11:46 server 1 /usr/sbin/opensips[12997]: ERROR:core:tcp_connect_blocking: timeout 999484 ms elapsed from 1000000 s Jan 16 17:11:46 server 1 /usr/sbin/opensips[12997]: ERROR:proto_tls:tls_sync_connect: tcp_blocking_connect failed Jan 16 17:11:46 server 1 /usr/sbin/opensips[12997]: ERROR:proto_tls:proto_tls_send: connect failed Jan 16 17:11:46 server 1 /usr/sbin/opensips[12997]: ERROR:tm:msg_send: send() for proto 3 failed Jan 16 17:11:46 server 1 /usr/sbin/opensips[12997]: INFO:core:probe_max_sock_buff: using snd buffer of 32768 kb Jan 16 17:11:46 server 1 /usr/sbin/opensips[12997]: INFO:core:init_sock_keepalive: TCP keepalive enabled on socket 85 Jan 16 17:11:46 server 1 /usr/sbin/opensips[12957]: WARNING:core:utimer_ticker: utimer task <tm-utimer> already scheduled for 411783590 ms (now 411783690 ms), it may overlap.. Jan 16 17:11:46 server 1 /usr/sbin/opensips[12957]: WARNING:core:utimer_ticker: utimer task <tm-utimer> already scheduled for 411783690 ms (now 411783790 ms), it may overlap.. Jan 16 17:11:46 server 1 /usr/sbin/opensips[13006]: INFO:core:probe_max_sock_buff: using snd buffer of 32768 kb Jan 16 17:11:46 server 1 /usr/sbin/opensips[13006]: INFO:core:init_sock_keepalive: TCP keepalive enabled on socket 1415 Jan 16 17:11:46 server 1 /usr/sbin/opensips[13006]: INFO:core:send2child: no free tcp receiver, connection passed to the least busy one (141) Jan 16 17:11:46 server 1 /usr/sbin/opensips[13006]: INFO:core:probe_max_sock_buff: using snd buffer of 32768 kb Jan 16 17:11:46 server 1 /usr/sbin/opensips[13006]: INFO:core:init_sock_keepalive: TCP keepalive enabled on socket 1419 Jan 16 17:11:46 server 1 /usr/sbin/opensips[13006]: INFO:core:send2child: no free tcp receiver, connection passed to the least busy one (141) Jan 16 17:11:46 server 1 /usr/sbin/opensips[13006]: INFO:core:probe_max_sock_buff: using snd buffer of 32768 kb Jan 16 17:11:46 server 1 /usr/sbin/opensips[13006]: INFO:core:init_sock_keepalive: TCP keepalive enabled on socket 1420 Jan 16 17:11:46 server 1 /usr/sbin/opensips[13006]: INFO:core:send2child: no free tcp receiver, connection passed to the least busy one (141) Jan 16 17:11:46 server 1 /usr/sbin/opensips[12957]: WARNING:core:utimer_ticker: utimer task <tm-utimer> already scheduled for 411783790 ms (now 411783890 ms), it may overlap.. Jan 16 17:11:46 server 1 /usr/sbin/opensips[12987]: ERROR:core:tcp_connect_blocking: timeout 999053 ms elapsed from 1000000 s Jan 16 17:11:46 server 1 /usr/sbin/opensips[12987]: ERROR:proto_tls:tls_sync_connect: tcp_blocking_connect failed Jan 16 17:11:46 server 1 /usr/sbin/opensips[12987]: ERROR:proto_tls:proto_tls_send: connect failed
_______________________________________________ Users mailing list Users@lists.opensips.org http://lists.opensips.org/cgi-bin/mailman/listinfo/users