Re: [sr-dev] [kamailio/kamailio] Infinite loop inside htable module during dmq synch (#1863)
@charlesrchance do you need more tests before merging this fix? -- You are receiving this because you are subscribed to this thread. Reply to this email directly or view it on GitHub: https://github.com/kamailio/kamailio/issues/1863#issuecomment-467842815___ Kamailio (SER) - Development Mailing List sr-dev@lists.kamailio.org https://lists.kamailio.org/cgi-bin/mailman/listinfo/sr-dev
Re: [sr-dev] [kamailio/kamailio] Infinite loop inside htable module during dmq synch (#1863)
Hi @charlesrchance , I tried but couldn't reproduce the issue, I'm wondering how it happened -- You are receiving this because you are subscribed to this thread. Reply to this email directly or view it on GitHub: https://github.com/kamailio/kamailio/issues/1863#issuecomment-466984410___ Kamailio (SER) - Development Mailing List sr-dev@lists.kamailio.org https://lists.kamailio.org/cgi-bin/mailman/listinfo/sr-dev
Re: [sr-dev] [kamailio/kamailio] Infinite loop inside htable module during dmq synch (#1863)
Thank you @charlesrchance , I did some tests with this setup: kamailio.cfg (meaningful lines): ``` fork=yes children=1 tcp_connection_lifetime=3605 pv_buffer_size=8192 # - dmq params - modparam("dmq", "server_address", DMQ_SERVER_ADDRESS) modparam("dmq", "notification_address", DMQ_NOTIFICATION_ADDRESS) modparam("dmq", "multi_notify", 1) modparam("dmq", "num_workers", 1) modparam("dmq", "ping_interval", 15) modparam("dmq", "worker_usleep", 1000) # - htable params - modparam("htable", "enable_dmq", 1) modparam("htable", "dmq_init_sync", 1) modparam("htable", "htable", "ht=>size=16;dmqreplicate=1;autoexpire=10800;") # Keep track of concurrent channels for accounts. Should be same as dialog modparam("htable", "htable", "ht1=>size=16;dmqreplicate=1;autoexpire=10800;") # Keep track of concurrent channels for accounts. Should be same as dialog modparam("htable", "htable", "ht2=>size=16;dmqreplicate=1;autoexpire=10800;") # Keep track of concurrent channels for accounts. Should be same as dialog modparam("htable", "htable", "ht3=>size=16;dmqreplicate=1;autoexpire=10800;") # Keep track of concurrent channels for accounts. Should be same as dialog #!define ONEK "really 1 k chars, believe me :)" event_route[htable:mod-init] { $var(name) = POD_NAME + "\n"; xlog("L_ALERT", "$var(name)"); if(POD_NAME == "kama-0") { $var(count) = 0; while($var(count) < 99) { $sht(ht=>$var(count)) = ONEK; $sht(ht1=>$var(count)) = ONEK; $sht(ht2=>$var(count)) = ONEK; $sht(ht3=>$var(count)) = ONEK; $var(count) = $var(count)+1; } } } request_route { if ($rm == "KDMQ"){ dmq_handle_message(); } exit; } ``` Started kama-0 which has now 4 htables of ~99K size each Started 10 kubernetes pods and launched kamailio 100 times with a timeout of 3 seconds on each pod So we have roughly 1000 kamailios trying to get these htables from kama-0 I didn't see any dangerous CPU spike and the loop doesn't happen anymore. There's something I'm worried of though: the memory of the DMQ worker (measured from top), which usually stays around 0.1% is now stable at 1.4% and it's not going down again I fear there's a memory leak somewhere but I'm not sure where, I had some doubts while debugging the loop issue about how the json_t structures are freed but it could be caused by me not knowing well the code; can you give us any hint to help you understand this issue better? Thanks -- You are receiving this because you are subscribed to this thread. Reply to this email directly or view it on GitHub: https://github.com/kamailio/kamailio/issues/1863#issuecomment-466341124___ Kamailio (SER) - Development Mailing List sr-dev@lists.kamailio.org https://lists.kamailio.org/cgi-bin/mailman/listinfo/sr-dev
[sr-dev] [kamailio/kamailio] Infinite loop inside htable module during dmq synch (#1863)
### Description When restarting kamailio nodes in our infrastructure we noticed that under traffic some nodes started using the 100% of the CPU, with the precious help of @giavac we were able to track down the issue to an infinite loop inside the htable module when synchronizing somewhat big (60K) htables via dmq ### Troubleshooting Reproduction Have 1 kamailio instance with a 60K+ htable and start a new instance, the first instance will try to send the whole table to the new instance and it will enter an infinite loop which consumes 100% of the CPU This is caused by a double call to **ht_dmq_cell_group_flush** which creates a circular list on the json structure hierarchy, the second call happens in this block of code (hence why it's required a 60K htable): https://github.com/kamailio/kamailio/blob/5.2/src/modules/htable/ht_dmq.c#L509 When this happens **ht_dmq_cell_group_flush** try to add **ht_dmq_jdoc_cell_group.jdoc_cells** inside **ht_dmq_jdoc_cell_group.jdoc->root** but this root already has **json_cells** as its child so when **srjson_AddItemToObject** is called (and in turn **srjson_AddItemToArray**) it gets appended as the child of itself: https://github.com/kamailio/kamailio/blob/master/src/lib/srutils/srjson.c#L813 This circular structure then causes a loop when calling **srjson_PrintUnformatted** because in the **print_object** function the circular list is looped over: https://github.com/kamailio/kamailio/blob/master/src/lib/srutils/srjson.c#L679 ### Possible Solutions One possible solution could be to destroy and init again the **ht_dmq_jdoc_cell_group** structure after calling the flush: ``` if (ht_dmq_jdoc_cell_group.size >= dmq_cell_group_max_size) { LM_DBG("sending group count[%d]size[%d]\n", ht_dmq_jdoc_cell_group.count, ht_dmq_jdoc_cell_group.size); if (ht_dmq_cell_group_flush(node) < 0) { ht_slot_unlock(ht, i); goto error; } ht_dmq_cell_group_destroy(); ht_dmq_cell_group_init(); } ``` But we are not sure about the performance implications. ### Additional Information `# kamailio -v version: kamailio 5.2.1 (x86_64/linux) 44e488 flags: STATS: Off, USE_TCP, USE_TLS, USE_SCTP, TLS_HOOKS, USE_RAW_SOCKS, DISABLE_NAGLE, USE_MCAST, DNS_IP_HACK, SHM_MEM, SHM_MMAP, PKG_MALLOC, Q_MALLOC, F_MALLOC, TLSF_MALLOC, DBG_SR_MEMORY, USE_FUTEX, FAST_LOCK-ADAPTIVE_WAIT, USE_DNS_CACHE, USE_DNS_FAILOVER, USE_NAPTR, USE_DST_BLACKLIST, HAVE_RESOLV_RES ADAPTIVE_WAIT_LOOPS=1024, MAX_RECV_BUFFER_SIZE 262144 MAX_URI_SIZE 1024, BUF_SIZE 65535, DEFAULT PKG_SIZE 8MB poll method support: poll, epoll_lt, epoll_et, sigio_rt, select. id: 44e488 compiled on 11:52:58 Feb 21 2019 with gcc 5.4.0 ` * **Operating System**: ubuntu:xenial docker container `# uname -a Linux kama-0 4.4.0-135-generic #161-Ubuntu SMP Mon Aug 27 10:45:01 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux` -- You are receiving this because you are subscribed to this thread. Reply to this email directly or view it on GitHub: https://github.com/kamailio/kamailio/issues/1863___ Kamailio (SER) - Development Mailing List sr-dev@lists.kamailio.org https://lists.kamailio.org/cgi-bin/mailman/listinfo/sr-dev
[sr-dev] [kamailio/kamailio] KEMI: add htable's sht_rm_* functions (#1614)
Hi, currently htable module's sht_rm_* functions are not exported to KEMI, could anyone please export them to KEMI? Thanks, Enrico -- You are receiving this because you are subscribed to this thread. Reply to this email directly or view it on GitHub: https://github.com/kamailio/kamailio/issues/1614___ Kamailio (SER) - Development Mailing List sr-dev@lists.kamailio.org https://lists.kamailio.org/cgi-bin/mailman/listinfo/sr-dev
Re: [sr-dev] [kamailio/kamailio] Feature Request: possibility to reload dialogs from db without restarting kamailio (#1274)
This feature is really interesting, any news about this? -- You are receiving this because you are subscribed to this thread. Reply to this email directly or view it on GitHub: https://github.com/kamailio/kamailio/issues/1274#issuecomment-357186272___ Kamailio (SER) - Development Mailing List sr-dev@lists.kamailio.org https://lists.kamailio.org/cgi-bin/mailman/listinfo/sr-dev