> I'll also give a try with 6.0.1 later.
Meanwhile, I've tested 5.8.6 and 6.0.1 and both show the same issue.

Am Do., 20. März 2025 um 10:21 Uhr schrieb Mathias Schneuwly <
[email protected]>:

> Hi Henning
>
> > Maybe also check the actual limits by “cat /proc/$kamailio-pid/limits”
> Not 100% if its related, but if you are looking to scale this would be for
> sure required.
> The limit was
> Max open files            1024                 524288               files
> and I changed it now to
> Max open files            65536                65536                files
>
> Are there some recommendations for the limit? Would it make sense to
> change that in the provided systemd file if you say that 1024 is not that
> much?
>
> But anyway, it fails with 1024 and with 65536.
>
> I've also updated to 5.8.5 and it shows the same issue. I'll also give a
> try with 6.0.1 later.
>
> > About the tls_thread_mode – you should use mode 2 if you are running
> with openssl 3.x, otherwise you probably will observe random crashes.
> We are running OpenSSL 3.2.3 and tls_thread_mode is set to 2. But it still
> blocks.
>
> Best regards
> Mathias
>
> Am Do., 20. März 2025 um 08:42 Uhr schrieb Henning Westerholt <
> [email protected]>:
>
>> Hello Mathias,
>>
>>
>>
>> 1024 open files are not that much, you should increase it for the
>> kamailio user. Maybe also check the actual limits by “cat
>> /proc/$kamailio-pid/limits”
>>
>> Not 100% if its related, but if you are looking to scale this would be
>> for sure required.
>>
>>
>>
>> About the tls_thread_mode – you should use mode 2 if you are running with
>> openssl 3.x, otherwise you probably will observe random crashes.
>>
>>
>>
>> If you are seeing the mentioned problem with 15 phones, this is clearly a
>> problem and should be further investigated.
>>
>>
>>
>> Cheers,
>>
>>
>>
>> Henning
>>
>>
>>
>>
>>
>> *From:* Mathias Schneuwly via sr-users <[email protected]>
>> *Sent:* Donnerstag, 20. März 2025 08:07
>> *To:* [email protected]
>> *Cc:* Mathias Schneuwly <[email protected]>
>> *Subject:* [SR-Users] Re: send_fd(): sendmsg failed sending 55 on 52:
>> Too many references: cannot splice (109)
>>
>>
>>
>> Hi
>>
>>
>>
>> Thanks for the reply. We are using the following limits and I guess they
>> are the default values
>>
>> real-time non-blocking time  (microseconds, -R) unlimited
>> core file size              (blocks, -c) unlimited
>> data seg size               (kbytes, -d) unlimited
>> scheduling priority                 (-e) 0
>> file size                   (blocks, -f) unlimited
>> pending signals                     (-i) 15666
>> max locked memory           (kbytes, -l) 8192
>> max memory size             (kbytes, -m) unlimited
>> open files                          (-n) 1024
>> pipe size                (512 bytes, -p) 8
>> POSIX message queues         (bytes, -q) 819200
>> real-time priority                  (-r) 0
>> stack size                  (kbytes, -s) 8192
>> cpu time                   (seconds, -t) unlimited
>> max user processes                  (-u) 15666
>> virtual memory              (kbytes, -v) unlimited
>> file locks                          (-x) unlimited
>>
>>
>>
>> But to me, it looks more like a deadlock somewhere around SSL_accept.
>> According to the stack trace generated with "kamctl trap", all 8 TCP
>> receiver processes are waiting in "#1  0x00007fa14ca04830 in
>> pthread_rwlock_wrlock () from /usr/lib/libc.so.6". I guess the message "Too
>> many references: cannot splice (109)" is just the consequence as the main
>> process is no longer able to dispatch jobs to the child workers.
>>
>>
>>
>> Meanwhile, I'm able to reproduce the issue. It requires a bit of brute
>> force, but it deadlocks reliably after a few seconds.
>>
>>
>>
>> I'm flooding Kamailio with register messages using sipexer. I'm running
>> the following command 3 times in parallel and always after a few seconds,
>> Kamailio is completely blocked.
>>
>> watch -n 0.1 "./sipexer -register -vl 3 -co -com -ex 3600 -fu "2"  -au
>> "2" -ap "test" -tk <path to key>.key -tc <path to cert>.cer -ti -sd -ru
>> sip:<domain> -cu sip:2@<domain> tls:<fqdn>:5061;sleep 0.1;./sipexer
>> -register -vl 3 -co -com -ex 0 -fu "2"  -au "2" -ap "test" -tk <path to
>> key>.key -tc <path to cert>.cer -ti -sd -ru sip:<domain> -cu
>> sip:2@<domain> tls:<fqdn>:5061"
>>
>>
>>
>> I played a bit with the parameters "modparam("tls", "init_mode", 1)" and
>> "tls_threads_mode = 2", but it didn't seem to change anything.
>>
>>
>>
>> Please have in mind that we usually do not stress Kamailio that much. We
>> have seen the issue a couple of times with around 5 to 15 SIP phones
>> registered. But in this usual setup, the problem does not occur reliably.
>>
>>
>>
>> Best regards
>>
>> Mathias
>>
>>
>>
>> Am Di., 18. März 2025 um 21:53 Uhr schrieb Henning Westerholt <
>> [email protected]>:
>>
>> Hello,
>>
>>
>>
>> check the ulimit settings for the kamailio user and general resource
>> usage on that system. In development branch there is also a feature present
>> to print the actual ulimits on startup.
>>
>>
>>
>> Cheers,
>>
>>
>>
>> Henning
>>
>>
>>
>> *From:* Mathias Schneuwly via sr-users <[email protected]>
>> *Sent:* Montag, 17. März 2025 13:41
>> *To:* [email protected]
>> *Cc:* Mathias Schneuwly <[email protected]>
>> *Subject:* [SR-Users] send_fd(): sendmsg failed sending 55 on 52: Too
>> many references: cannot splice (109)
>>
>>
>>
>> Hi
>>
>>
>>
>> We are using kamailio 5.8.4.
>>
>>
>>
>> Recently, we noticed, that after a few days, we see the following
>> messages from Kamailio `2025 Mar 17 10:22:11 ttel CRITICAL: <core>
>> [core/pass_fd.c:193]: send_fd(): sendmsg failed sending 55 on 52: Too many
>> references: cannot splice (109)`.
>>
>>
>>
>> When this happens, devices are no longer able to register. At the moment
>> we have no idea what might be wrong.
>>
>>
>>
>> Unfortunately, the internet does not reveal much information regarding
>> this issue. In one issue related to Kamailio (
>> https://lists.kamailio.org/mailman3/hyperkitty/list/[email protected]/thread/FKN25CGV27KA3GTS3ED3E7V3WKRZALRP/),
>> I found the command `kamctl trap` and when I execute that, I see one issue,
>> which might be related, but I'm just guessing...
>>
>>
>>
>> ```
>>
>> ---start 1782236 -----------------------------------------------------
>> [Thread debugging using libthread_db enabled]
>> Using host libthread_db library "/usr/lib/libthread_db.so.1".
>> 0x00007fa14c9fb1c6 in ?? () from /usr/lib/libc.so.6
>> $1 = 26
>> $2 = {pid = 1782236, unix_sock = 36, idx = 0, status = 1, rank = 17, desc
>> = "tcp receiver (generic) child=0", '\000' <repeats 97 times>}
>> #0  0x00007fa14c9fb1c6 in ?? () from /usr/lib/libc.so.6
>> No symbol table info available.
>> #1  0x00007fa14ca04830 in pthread_rwlock_wrlock () from /usr/lib/libc.so.6
>> No symbol table info available.
>> #2  0x00007fa14ac20b49 in CRYPTO_THREAD_write_lock () from
>> /usr/lib/libcrypto.so.3
>> No symbol table info available.
>> #3  0x00007fa14adbd5da in ?? () from /usr/lib/libcrypto.so.3
>> No symbol table info available.
>> #4  0x00007fa14adbd9ca in ?? () from /usr/lib/libcrypto.so.3
>> No symbol table info available.
>> #5  0x00007fa14adbc907 in ?? () from /usr/lib/libcrypto.so.3
>> No symbol table info available.
>> #6  0x00007fa14adbd26c in ?? () from /usr/lib/libcrypto.so.3
>> No symbol table info available.
>> #7  0x00007fa14adbd68e in ?? () from /usr/lib/libcrypto.so.3
>> No symbol table info available.
>> #8  0x00007fa14abe75c6 in ?? () from /usr/lib/libcrypto.so.3
>> No symbol table info available.
>> #9  0x00007fa14abe840b in EVP_RAND_generate () from
>> /usr/lib/libcrypto.so.3
>> No symbol table info available.
>> #10 0x00007fa14acf9282 in RAND_bytes_ex () from /usr/lib/libcrypto.so.3
>> No symbol table info available.
>> #11 0x00007fa14b06cce8 in ?? () from /usr/lib/libssl.so.3
>> No symbol table info available.
>> #12 0x00007fa14b05bdf7 in ?? () from /usr/lib/libssl.so.3
>> No symbol table info available.
>> #13 0x00007fa14b0ef036 in tls_accept (c=c@entry=0x7fa1475c0dc0,
>> error=error@entry=0x7ffebd9443b8) at tls_server.c:457
>>         ret = <optimized out>
>>         ssl = warning: could not convert 'ssl_st' from the host encoding
>> (ANSI_X3.4-1968) to UTF-32.
>> This normally should not happen, please file a bug report.
>> 0x7fa147668bd0
>>         cert = <optimized out>
>>         tls_c = 0x7fa14768f490
>>         tls_log = <optimized out>
>>         __func__ = "tls_accept"
>> #14 0x00007fa14b0f7540 in tls_h_read_f (c=0x7fa1475c0dc0,
>> flags=0x7ffebd9647d0) at tls_server.c:1153
>>         r = 0x7fa1475c0ee8
>>         bytes_free = 16383
>>         bytes_read = <optimized out>
>>         read_size = <optimized out>
>>         ssl_error = 0
>>         ssl_read = 0
>>         ssl = 0x7fa147668bd0
>>         rd_buf =
>> "\026\003\003\004\016\v\000\004\n\000\004\a\000\004\0040\202\004\0000\202\002\350\240\003\002\001\002\002\002\026\0270\r\006\t*\206H\206\367\r\001\001\v\005\0000b1\v0\t\006\003U\004\006\023\002CH1\r0\v\006\003U\004\b\f\004Bern1\r0\v\006\003U\004\a\f\004Bern1\r0\v\006\003U\004\n\f\004RUAG1\0200\016\006\003U\004\v\f\aDefence1\0240\022\006\003U\004\003\f\versaimfs.ch0\036\027\r230302163519Z\027\r291230163519Z0`1\v0\t\006\003U\004\006\023\002CH1\r0\v\006\003"...
>>         wr_buf =
>> "\026\003\003\000A\002\000\000=\003\003\371\204\326\361\262-9\300\344gV\2316\373\235D\314\225u8\v}\215\254DOWNGRD\001\000\3000\000\000\025\377\001\000\001\000\000\v\000\004\003\000\001\002\000#\000\000\000\027\000\000\026\003\003\a\267\v\000\a\263\000\a\260\000\004\0040\202\004\0000\202\002\350\240\003\002\001\002\002\002\r>0\r\006\t*\206H\206\367\r\001\001\v\005\0000b1\v0\t\006\003U\004\006\023\002CH1\r0\v\006\003U\004\b\f\004Bern1\r0\v\006\003U\004\a\f\004Bern1\r0\v\006\003U\004\n\f\004RUAG1\0200\016\006\003U\004\v\f\aDefence1\0240\022\006"...
>>         rd = {buf = 0x7ffebd944480 "\026\003\003\004\016\v", pos = 1405,
>> used = 1405, size = 65536}
>>         wr = {buf = 0x7ffebd954480 "\026\003\003", pos = 0, used = 0,
>> size = 65536}
>>         tls_c = 0x7fa14768f490
>>         enc_rd_buf = <optimized out>
>>         n = 0
>>         flush_flags = 0
>>         err_src = <optimized out>
>>         ip_buf = '\000' <repeats 63 times>
>>         x = <optimized out>
>>         tls_dbg = <optimized out>
>>         __func__ = "tls_h_read_f"
>>         redo_read = <optimized out>
>>         continue_ssl_read = <optimized out>
>>         __llevel = <optimized out>
>>         __kld = <optimized out>
>>         __llevel = <optimized out>
>>         __kld = <optimized out>
>> #15 0x000055d2519871b4 in tcp_read_headers (c=c@entry=0x7fa1475c0dc0,
>> read_flags=read_flags@entry=0x7ffebd9647d0) at core/tcp_read.c:445
>>         bytes = <optimized out>
>>         remaining = <optimized out>
>>         p = <optimized out>
>>         r = 0x7fa1475c0ee8
>>         mc = <optimized out>
>>         body_len = <optimized out>
>>         tvnow = {tv_sec = 1742100857, tv_usec = 944923}
>>         tvdiff = <optimized out>
>>         mfline = <optimized out>
>>         mtransid = <optimized out>
>>         __func__ = "tcp_read_headers"
>> #16 0x000055d251989cef in tcp_read_req (con=0x7fa1475c0dc0,
>> bytes_read=bytes_read@entry=0x7ffebd9647c8, 
>> read_flags=read_flags@entry=0x7ffebd9647d0)
>> at core/tcp_read.c:1509
>>         bytes = <optimized out>
>>         total_bytes = 0
>>         resp = 1
>>         size = <optimized out>
>>         req = 0x7fa1475c0ee8
>>         dst = {send_sock = 0x55d251a91020, to = {s = {sa_family = 52024,
>> sa_data = "\257Q\322U\000\000\002\000\000\000\376\177\000"}, sin =
>> {sin_family = 52024, sin_port = 20911, sin_addr = {s_addr = 21970},
>> sin_zero = "\002\000\000\000\376\177\000"}, sin6 = {sin6_family = 52024,
>> sin6_port = 20911, sin6_flowinfo = 21970, sin6_addr = {__in6_u =
>> {__u6_addr8 =
>> "\002\000\000\000\376\177\000\000\030\000\000\000\000\000\000", __u6_addr16
>> = {2, 0, 32766, 0, 24, 0, 0, 0}, __u6_addr32 = {2, 32766, 24, 0}}},
>> sin6_scope_id = 0}, sas = {ss_family = 52024, __ss_padding =
>> "\257Q\322U\000\000\002\000\000\000\376\177\000\000\030", '\000' <repeats
>> 23 times>,
>> "\001\000\000\000\000\000\000\000\001\000\000\000\000\000\000\000\262+\226Q\001
>> \000\000\200,5L\241\177\000\000 \020\251Q\322U\000\000
>> \020\251Q\322U\000\000\200\303\257Q\322U\000\00009SG\241\177\000\000\330G\226\275\376\177\000\000\b\000\000\000\000\000\000",
>> __ss_align = 20}}, id = 1, send_flags = {f = 1, blst_imask = 0}, proto = 14
>> '\016', proto_pad0 = 0 '\000', proto_pad1 = 0}
>>         c = <optimized out>
>>         ret = <optimized out>
>>         again = <optimized out>
>>         __func__ = "tcp_read_req"
>> #17 0x000055d25198ebec in handle_io (fm=fm@entry=0x7fa14c352c80,
>> events=events@entry=1, idx=idx@entry=-1) at core/tcp_read.c:1906
>>         ret = 8
>>         n = 0
>>         read_flags = RD_CONN_SHORT_READ
>>         con = 0x7fa1475c0dc0
>>         s = 14
>>         resp = <optimized out>
>>         t = <optimized out>
>>         ee = 0x0
>>         __func__ = "handle_io"
>>         error = <optimized out>
>> #18 0x000055d25199469b in io_wait_loop_epoll (repeat=repeat@entry=0,
>> t=2, h=0x55d251c49d60 <io_w>) at core/io_wait.h:1075
>>         n = 1
>>         r = 0
>>         fm = 0x7fa14c352c80
>>         revents = 1
>>         __func__ = "io_wait_loop_epoll"
>> #19 0x000055d251994e87 in tcp_receive_loop (unix_sock=<optimized out>) at
>> core/tcp_read.c:2026
>>         __func__ = "tcp_receive_loop"
>> #20 0x000055d25197e72c in tcp_init_children 
>> (woneinit=woneinit@entry=0x7ffebd964d1c)
>> at core/tcp_main.c:5391
>>         r = <optimized out>
>>         i = <optimized out>
>>         reader_fd_1 = 39
>>         pid = <optimized out>
>>         si_desc = "tcp receiver
>> (generic)\000\000`M\226\275\376\177\000\000\001\000\000\000\000\000\000\000\210\035\223Q\322U\000\000
>> \020\251Q\322U\000\000\220\315\257Q\322U\000\000\220\242/L\241\177\000\000\025\000\000\000\000\000\000\000\004\000\000\000\000\000\000\000%\020\251Q\322U\000\000HU\257Q\322U\000\000e\005\000\000\000\000\000\000
>> \020\251Q\322U\000\000\3750\235Q\322U\000"
>>         si = <optimized out>
>>         __func__ = "tcp_init_children"
>>         error = <optimized out>
>> #21 0x000055d2517c6c7e in main_loop () at main.c:1950
>>         i = <optimized out>
>>         pid = <optimized out>
>>         si = <optimized out>
>>         si_desc = "udp receiver child=7 
>> sock=10.0.90.1:5060\000\000\000\000\322U\000\000?\342\017K\241\177\000\000\022\000\000\000\377\377\377\377\200\006\265L\241\177\000\000\253m\237L\241\177\000\000\361|\003\000\000\000\000\000\t\000\000\000\000\000\000\000Sat
>> Mar 15 05:25:32 2025\n\000\264L\241\177\000"
>>         nrprocs = <optimized out>
>>         woneinit = 1
>>         __func__ = "main_loop"
>>         error = <optimized out>
>> #22 0x000055d2517bc151 in main (argc=<optimized out>, argv=<optimized
>> out>) at main.c:3256
>>         cfg_stream = <optimized out>
>>         c = <optimized out>
>>         r = <optimized out>
>>         tmp = 0x7ffebd966e7f ""
>>         tmp_len = 32673
>>         port = 5060
>>         proto = 0
>>         aproto = 0
>>         ahost = 0x0
>>         aport = 0
>>         options = 0x55d251aa74b8
>> ":f:cm:M:dVIhEeb:B:l:L:n:vKrRDTN:W:w:t:u:g:P:G:SQ:O:a:A:x:X:Y:"
>>         ret = -1
>>         seed = 3209119183
>>         rfd = <optimized out>
>>         debug_save = <optimized out>
>>         debug_flag = <optimized out>
>>         dont_fork_cnt = 0
>>         n_lst = <optimized out>
>>         p = <optimized out>
>>         st = {st_dev = 24, st_ino = 46237, st_nlink = 2, st_mode = 16888,
>> st_uid = 899, st_gid = 899, __pad0 = 0, st_rdev = 0, st_size = 40,
>> st_blksize = 4096, st_blocks = 0, st_atim = {tv_sec = 1742012732, tv_nsec =
>> 217328715}, st_mtim = {tv_sec = 1742012732, tv_nsec = 217328715}, st_ctim =
>> {tv_sec = 1742012732, tv_nsec = 217328715}, __glibc_reserved = {0, 0, 0}}
>>         l1 = <optimized out>
>>         tbuf = '\000' <repeats 32 times>,
>> "\030\202\237\275\376\177\000\000@\002\000\000@\003\000\000\001", '\000'
>> <repeats 23 times>, "@\003\000\000@\003\000\000@\003\000\000@
>> \003\000\000@\003\000\000@\003\000\000@\003\000\000@\003\000\000@
>> \003\000\000@\003\000\000@\003\000\000@\003\000\000@\003\000\000@
>> \003\000\000@\003\000\000@\003\000\000@\003\000\000@\003\000\000@
>> \003\000\000@\003\000\000@\003\000\000@\003", '\000' <repeats 11 times>,
>> "\001", '\000' <repeats 118 times>...
>>         option_index = 12
>>         long_options = {{name = 0x55d251a9230b "help", has_arg = 0, flag
>> = 0x0, val = 104}, {name = 0x55d251a937a4 "version", has_arg = 0, flag =
>> 0x0, val = 118}, {name = 0x55d251a9826a "alias", has_arg = 1, flag = 0x0,
>> val = 1024}, {name = 0x55d251a92310 "subst", has_arg = 1, flag = 0x0, val =
>> 1025}, {name = 0x55d251a92316 "substdef", has_arg = 1, flag = 0x0, val =
>> 1026}, {name = 0x55d251a9231f "substdefs", has_arg = 1, flag = 0x0, val =
>> 1027}, {name = 0x55d251a92329 "server-id", has_arg = 1, flag = 0x0, val =
>> 1028}, {name = 0x55d251a92333 "loadmodule", has_arg = 1, flag = 0x0, val =
>> 1029}, {name = 0x55d251a9233e "modparam", has_arg = 1, flag = 0x0, val =
>> 1030}, {name = 0x55d251a92347 "log-engine", has_arg = 1, flag = 0x0, val =
>> 1031}, {name = 0x55d251a938c1 "debug", has_arg = 1, flag = 0x0, val =
>> 1032}, {name = 0x55d251a92352 "cfg-print", has_arg = 0, flag = 0x0, val =
>> 1033}, {name = 0x55d251a9235c "atexit", has_arg = 1, flag = 0x0, val =
>> 1034}, {name = 0x55d251a92363 "all-errors", has_arg = 0, flag = 0x0, val =
>> 1035}, {name = 0x0, has_arg = 0, flag = 0x0, val = 0}}
>>         __func__ = "main"
>> [Inferior 1 (process 1782236) detached]
>> ---end 1782236 -------------------------------------------------------
>>
>> ```
>>
>>
>>
>> Does anybody have an idea what this could be? I never saw that with
>> previous versions, and now we can see it on several instances, but we are
>> yet unable to reproduce.
>>
>>
>>
>> Not sure if this might be related, but we also see many messages like
>> `Mar 17 13:39:03 ttel /usr/sbin/kamailio[2882127]: CRITICAL: <core>
>> [core/tcp_main.c:5558]: tcp_timer_check_connections(): message processing
>> timeout on connection id: 402 (state: 0) - closing`
>>
>>
>>
>> Best regards
>>
>> Mathias
>>
>>
__________________________________________________________
Kamailio - Users Mailing List - Non Commercial Discussions -- 
[email protected]
To unsubscribe send an email to [email protected]
Important: keep the mailing list in the recipients, do not reply only to the 
sender!

Reply via email to