Perhaps a simpler way to look at it: buffers. It's in the name - they buffer things.

If your "work queue" buffer (which is what the packet RecvQ fundamentally is) is chronically full, to the degree that it needs to be increased significantly, it means things aren't consistently leaving at the other end at the same velocity that they're coming in.
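
(If you want to see this for yourself, here's a minimal sketch, assuming OpenSIPS is listening on UDP 5060 - on a healthy box the Recv-Q column hovers around 0:

# watch the kernel receive queue on the SIP socket; a Recv-Q that
# stays large is a backlog, not a tuning opportunity
watch -n1 "ss -lun 'sport = :5060'"
)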

In small amounts, this regulating capacity is acceptable and necessary--that's why buffers exist.

But increasing the depth of the queue by 78x (if I'm not mistaken, 212992 is the default--at least, it is on all my CentOS 7.x and 8.x systems, which I guess also have "absolutely terrible sysctl defaults") is faker than a Ponzi scheme. In some other contexts, this would be called morally bankrupt and intellectually fraudulent. I guess here we call it "mad dialer CPS" or whatever.
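
(For reference, you can check what your own kernel is doing easily enough:

# default and maximum socket receive buffer sizes, in bytes
sysctl net.core.rmem_default net.core.rmem_max
# e.g. on a stock install:
#   net.core.rmem_default = 212992
#   net.core.rmem_max = 212992

The second value is the cap that raising rmem_max lifts; the first is what a socket gets if it doesn't ask for more.)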

-- Alex

On 6/12/20 6:50 PM, Alex Balashov wrote:
There's no free lunch, but it seems like you and others want one. :-) Increasing these values just increases the depth of the kernel's packet queue for the sync processes to consume as they are able. It doesn't mean they're able, and if they're not, request response time will go up.

A healthy system that is able to keep up with the load you're throwing at it should show a receive queue at +/- 0 most of the time, maybe with some ephemeral spikes but generally trending around 0. If packets are stacking up in the RecvQ, it means the SIP worker processes aren't available enough to consume them all in a timely fashion.
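
If you want to know whether that backlog is actually overflowing rather than just sitting there, the kernel's UDP drop counters are a quick check (a sketch; the exact counter labels vary a little between tools):

nstat -az UdpRcvbufErrors UdpInErrors
# or:
netstat -su | grep -i errors

If "RcvbufErrors" / "receive buffer errors" keeps climbing, packets are being discarded before OpenSIPS ever sees them, no matter how deep the queue is.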

Leaning on async won't help here if the workload is largely CPU-bound. If it's largely bound up in waiting on network I/O from external services, it merely deputises the kernel to notify you when there's a response from those services. But - vitally - it won't get your requests processed faster, and setup latency is a very important consideration in real-time communications, especially from the perspective of interoperability with the synchronous/circuit-switched PSTN.

In short, async isn't magic, and neither is increasing the receive queue. It's simple thermodynamics: there's only so much CPU available, and throughput becomes more or less a linear function of available CPU hardware threads - more if the workload is CPU-bound, less (but slower) if it's largely I/O-bound.
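
To put a hypothetical number on it (the per-request cost below is an assumption for illustration, not a measurement):

# if each request costs ~0.5 ms of CPU end to end, 8 hardware threads
# top out at roughly 8 / 0.0005 = 16,000 CPS, and no queue depth or
# async arrangement raises that ceiling
awk 'BEGIN { print 8 / 0.0005 }'   # 16000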

The metaphor of a balloon is appropriate. You're pushing the problem around by squeezing one part of the balloon, causing another to enlarge. Various parts of the balloon can be squeezed - async vs. sync, various queues and buffers, etc. But the internal volume of air held by the balloon is more or less the same. A little slack can be added into the system through your rmem_max technique, as long as you're willing to tolerate increased processing latency--and it will generate increased latency; if it didn't, you wouldn't need to increase it--but ultimately, you're just pushing the air around the balloon. A fixed amount of CPU and memory is available to accommodate the large number of processes that sleep on an external I/O-bound workload, and there are diminishing returns as internal OpenSIPS contention and context switching grow.

I'm not saying there aren't some local minima and maxima, but they aren't as large as folks think. It's not that Ubuntu Server is mistuned, it's that you're abusing it. :-) You can't put the milk back in the cow, although it's quite a spectacle ...

-- Alex

On 6/12/20 6:02 PM, Calvin Ellison wrote:
I noticed a way-too-small receive buffer value in the OpenSIPS startup messages, and it turns out that a fresh Ubuntu 18 Server install has absolutely terrible sysctl defaults for high-performance networking. I got my 8-core lab from less than 2,000 CPS up to 14,000 CPS using a spread of all dips in non-async mode, just by setting the following to match "maxbuffer=16777216":

net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
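
For anyone following along, a minimal sketch of making that persistent (the sysctl.d file name is arbitrary; maxbuffer is the matching opensips.cfg global mentioned above):

cat > /etc/sysctl.d/90-opensips.conf <<'EOF'
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
EOF
sysctl --system    # reload all sysctl.d files without a reboot

# opensips.cfg:
# maxbuffer=16777216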

Does OpenSIPS have guidelines for sysctl and other OS parameters?

    Async requires the TM module which adds additional overhead and
    memory allocation.


According to the docs:
"By requiring less processes to complete the same amount of work in the
same amount of time, process context switching is minimized and
overall CPU usage is improved. Less processes will also eat up less
system memory."

So which is it? When should async be used, and when should async not be used? One can only invest so many hours in load testing combinations of sync/async, the number of children, timer_partitions, etc. Some fuzzy math based on CPU core count, SpecInt Rate, BogoMIPS, etc. would be a great starting point.

Regards,

*Calvin Ellison*
Senior Voice Operations Engineer
calvin.elli...@voxox.com



--
Alex Balashov | Principal | Evariste Systems LLC

Tel: +1-706-510-6800 / +1-800-250-5920 (toll-free)
Web: http://www.evaristesys.com/, http://www.csrpswitch.com/

_______________________________________________
Users mailing list
Users@lists.opensips.org
http://lists.opensips.org/cgi-bin/mailman/listinfo/users
