Richard,

Thank you very much for your response. Let me add some details.

Initially we tested all components in VMs with 2 vCPUs each, but after reading the list we changed the bono config to 1 vCPU; for sprout we left 2 vCPUs. The bono crashes were mostly resolved, but for sprout the situation is the same. We experience the same kind of crashes with the all-in-one image (with 8 vCPUs).

As for SNMP statistics, I don't know whether this is related or not, but we couldn't get bono/sprout functional statistics such as:

- The number of incoming requests, indexed by time period
- The number of requests rejected due to overload, indexed by time period

Those metrics were always zero.

I've added a pair of chronos dumps to the Dropbox folder. Maybe they can shed some more light on the problem:
https://www.dropbox.com/sh/qjdja9eowgvo1zc/AADm25_pwKNs3gWwBmb0Pzhpa?dl=0

BR,
Stanislav Khalup

From: Richard Whitehouse [mailto:richard.whiteho...@metaswitch.com]
Sent: Friday, September 2, 2016 5:19 PM
To: clearwater@lists.projectclearwater.org; Stanislav Khalup <skha...@virtuozzo.com>
Cc: Denis Plotnikov <dplotni...@virtuozzo.com>
Subject: RE: Sprout/Bono/Chronos crashes under stress test

Stanislav,

I've taken a look at the Sprout crash. It looks like you are hitting a crash in the Net-SNMP library we use for alarms and statistics. I've raised an issue to track this - https://github.com/Metaswitch/sprout/issues/1527

We've seen similar-looking stacks for Bono before on multi-core VMs, e.g. under http://lists.projectclearwater.org/pipermail/clearwater_lists.projectclearwater.org/2015-January/001986.html

Historically we've scaled up Sprout and Bono by running many single or dual-core instances rather than running fewer larger instances. This is because:

- we've seen virtualization environments impose per-VM limits on TCP connection counts, and obviously Bono has large numbers of TCP connections in a real-world scenario
- we only support a single transport thread and, since Bono performs relatively little processing per message, and Sprout needs to perform some processing per message, it is this thread that ends up being the bottleneck quite quickly.

Generally we've run single-core Bono nodes and dual-core Sprout nodes.

Having said that, we should look into why it's crashing when you're running more cores. Can you give us some description of the scenario you are running under when you see this?

You might also find it useful to subscribe to the mailing list so that you receive updates when we push out updated releases.

Thanks,
Richard

From: Clearwater [mailto:clearwater-boun...@lists.projectclearwater.org] On Behalf Of Stanislav Khalup
Sent: 01 September 2016 10:49
To: clearwater@lists.projectclearwater.org
Cc: Denis Plotnikov <dplotni...@virtuozzo.com>
Subject: [Project Clearwater] Sprout/Bono/Chronos crashes under stress test

Hello all,

We've been trying to perform IMS stress testing for some time now, but it seems that we are really unlucky. When we run the SIP test we experience constant bono/sprout crashes, which affect the results of our performance evaluation. We do know that our deployment generally works (we managed to place calls and run tests).

At first we manually deployed an IMS cluster, but after the crashes we decided to try the all-in-one VM. We still experience sprout crashes (bono crashes are mostly fixed after setting 1CPU/1Worker).

Could you please look at the dumps?
https://www.dropbox.com/sh/qjdja9eowgvo1zc/AADm25_pwKNs3gWwBmb0Pzhpa?dl=0

For now we have no clue what is happening.

BR,
Stanislav

_______________________________________________
Clearwater mailing list
Clearwater@lists.projectclearwater.org
http://lists.projectclearwater.org/mailman/listinfo/clearwater_lists.projectclearwater.org