Perhaps a simpler way to look at it: buffers. It's in the name - they
buffer things.
If your "work queue" buffer (which is what the packet RecvQ
fundamentally is) is chronically full, to the degree that it needs to be
increased significantly, it means things aren't consistently leaving
at the other end at the same velocity that they're coming in.
In small amounts, this regulating capacity is acceptable and
necessary--that's why buffers exist.
But increasing the depth of the queue by 78x (if I'm not mistaken,
212992 is the default--at least, it is on all my CentOS 7.x and 8.x
systems, which I guess also have "absolutely terrible sysctl defaults")
is faker than a Ponzi scheme. In some other contexts, this would be
called morally bankrupt and intellectually fraudulent. I guess here we
call it "mad dialer CPS" or whatever.
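To put rough numbers on that, here is a minimal sketch in Python. The two buffer sizes come from this thread; the 500-byte average datagram and the 2,000 packets/second drain rate are assumptions picked for illustration, not measurements, and the function names are hypothetical.

```python
DEFAULT_RMEM_MAX = 212992    # default cited above (CentOS 7.x / 8.x)
TUNED_RMEM_MAX = 16777216    # value proposed earlier in the thread

# Assumptions for illustration only, not measurements:
AVG_DATAGRAM_BYTES = 500     # hypothetical average SIP message size
DRAIN_RATE_PPS = 2000        # hypothetical rate at which workers read packets


def depth_ratio(tuned_bytes, default_bytes):
    """How many times deeper the tuned queue is than the default."""
    return tuned_bytes / default_bytes


def worst_case_wait_seconds(buffer_bytes, avg_packet_bytes, drain_rate_pps):
    """Seconds the last packet in a full buffer waits before being read.

    Ignores per-packet kernel overhead, so it overstates how many packets
    actually fit; treat the result as an upper bound.
    """
    queued_packets = buffer_bytes // avg_packet_bytes
    return queued_packets / drain_rate_pps


if __name__ == "__main__":
    print("depth ratio:", round(depth_ratio(TUNED_RMEM_MAX, DEFAULT_RMEM_MAX), 2))
    for size in (DEFAULT_RMEM_MAX, TUNED_RMEM_MAX):
        wait = worst_case_wait_seconds(size, AVG_DATAGRAM_BYTES, DRAIN_RATE_PPS)
        print(size, "bytes ->", round(wait, 2), "s behind a full queue")
```

Under those assumed numbers, a full default buffer is a fraction of a second of backlog, while a full 16 MiB buffer is many seconds of it. The deeper queue does not make anything drain faster; it only lets more work pile up before packets are dropped.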
-- Alex
On 6/12/20 6:50 PM, Alex Balashov wrote:
There's no free lunch, but it seems like you and others want one. :-)
Increasing these values just increases the depth of the kernel's packet
queue for the sync processes to consume as able. It doesn't mean they're
able, and accordingly, request response time will go up.
A healthy system that is able to keep up with the load you're throwing
at it should show a receive queue at +/- 0 most of the time, maybe with
some ephemeral spikes but generally trending around 0. If packets are
stacking up in the RecvQ, it means the SIP worker processes aren't
available enough to consume them all in a timely fashion.
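One way to watch this on Linux, as a sketch: the kernel lists every UDP socket in /proc/net/udp, with the receive queue depth in hex in the tx_queue:rx_queue column and a drop counter in the last column. The helper name below and the choice of port 5060 are assumptions for illustration; `ss -uln` or `netstat -uln` show the same Recv-Q figure.

```python
import os


def udp_rx_queues(proc_text, port):
    """Return (rx_queue_bytes, drops) for each UDP socket bound to `port`.

    `proc_text` is the content of /proc/net/udp. Ports and queue sizes are
    hexadecimal in that file; the drops column is decimal.
    """
    results = []
    for line in proc_text.splitlines()[1:]:      # skip the header row
        fields = line.split()
        if len(fields) < 13:
            continue                             # malformed or older format
        local_port = int(fields[1].rsplit(":", 1)[1], 16)
        if local_port != port:
            continue
        rx_queue = int(fields[4].split(":")[1], 16)
        drops = int(fields[12])
        results.append((rx_queue, drops))
    return results


if __name__ == "__main__" and os.path.exists("/proc/net/udp"):
    with open("/proc/net/udp") as f:
        for rx_queue, drops in udp_rx_queues(f.read(), 5060):
            print("rx_queue bytes:", rx_queue, "drops:", drops)
```

Sampling this every second or so under load shows whether the queue hovers near zero with occasional spikes, or sits chronically high.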
Leaning on async won't help here if the workload is largely CPU-bound.
If it's largely bound by waiting on network I/O from external
services, it merely delegates to the kernel the problem of notifying you
when there's a response from those services. But - vitally - it won't
get your requests processed faster, and setup latency is a very
important consideration in real-time communications, especially from the
perspective of interoperability with the synchronous/circuit-switched PSTN.
In short, async isn't magic, and neither is increasing the receive
queue. It's simple thermodynamics; there's only so much CPU available,
and depending on the nature of the workload, throughput becomes more of
a linear function of available CPU hardware threads if it's CPU-bound,
or less of one, but slower, if it's largely I/O-bound.
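As a back-of-the-envelope illustration of that ceiling, here is a deliberately crude model in Python. It assumes blocking (sync) workers, a fixed CPU cost and a fixed external wait per request, and it ignores contention and context switching entirely; the function name and all the numbers are hypothetical.

```python
def max_throughput(workers, cores, cpu_seconds, wait_seconds=0.0):
    """Upper bound on requests/second for blocking worker processes.

    Two ceilings apply and the lower one wins:
      - the CPU can only execute cores / cpu_seconds requests per second;
      - each worker is occupied for cpu_seconds + wait_seconds per request.
    """
    cpu_limit = cores / cpu_seconds
    worker_limit = workers / (cpu_seconds + wait_seconds)
    return min(cpu_limit, worker_limit)


if __name__ == "__main__":
    # CPU-bound: 1 ms of CPU per request, no external wait.
    print(max_throughput(workers=8, cores=8, cpu_seconds=0.001))
    # I/O-bound: same CPU cost plus 9 ms waiting on an external service.
    print(max_throughput(workers=8, cores=8, cpu_seconds=0.001, wait_seconds=0.009))
    # Adding workers helps the I/O-bound case only until the CPU limit is hit.
    print(max_throughput(workers=200, cores=8, cpu_seconds=0.001, wait_seconds=0.009))
```

In this model neither ceiling depends on the receive queue depth: a deeper queue changes how long requests wait, not how many complete per second.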
The metaphor of a balloon is appropriate. You're pushing the problem
around by squeezing one part of the balloon, causing another to enlarge.
Various parts of the balloon can be squeezed - async vs. sync, various
queues and buffers, etc. But the internal volume of air held by the
balloon is more or less the same. A little slack can be added into the
system through your rmem_max technique, as long as you're willing to
tolerate increased processing latency--and it will generate increased
latency; if it didn't, you wouldn't need to increase it--but ultimately,
you're just pushing the air around the balloon. A fixed amount of CPU
and memory is available to accommodate the large number of processes
that sleep on an external I/O-bound workload, and there are diminishing
returns from both internal OpenSIPS contention and context switching.
I'm not saying there aren't some local minima and maxima, but they
aren't as magnitudinal as folks think. It's not that Ubuntu Server is
mistuned, it's that you're abusing it. :-) You can't put the milk back
in the cow, although it's quite a spectacle ...
-- Alex
On 6/12/20 6:02 PM, Calvin Ellison wrote:
I noticed a way-too-small receive buffer value in the OpenSIPS startup
messages and it turns out that a fresh Ubuntu 18 Server install has
absolutely terrible sysctl defaults for high-performance networking. I
got my 8-core lab from less than 2,000 CPS up to 14,000 CPS using a
spread of all dips in non-async mode just by setting the following to
match "maxbuffer=16777216":
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
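A quick way to check what a socket is actually granted, sketched in Python: request a receive buffer size and read it back. On Linux the kernel caps the request at net.core.rmem_max and reports back double the accepted value to account for bookkeeping overhead (see socket(7)). The function name is hypothetical.

```python
import socket


def granted_rcvbuf(requested_bytes):
    """Request a UDP receive buffer size and return what the kernel granted."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, requested_bytes)
        return sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
```

Calling granted_rcvbuf(16777216) before and after changing the sysctl shows whether the larger limit took effect for newly created sockets.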
Does OpenSIPS have guidelines for sysctl and other OS parameters?
Async requires the TM module which adds additional overhead and
memory allocation.
According to the docs:
"By requiring less processes to complete the same amount of work in the
same amount of time, process context switching is minimized and
overall CPU usage is improved. Less processes will also eat up less
system memory."
So which is it? When should async be used, and when should async not
be used? One can only invest so many hours in load testing
combinations of sync/async, the number of children, timer_partitions,
etc. Some fuzzy math based on CPU core count, SpecInt Rate, BogoMIPS,
etc. would be a great starting point.
Regards,
*Calvin Ellison*
Senior Voice Operations Engineer
calvin.elli...@voxox.com
_______________________________________________
Users mailing list
Users@lists.opensips.org
http://lists.opensips.org/cgi-bin/mailman/listinfo/users
--
Alex Balashov | Principal | Evariste Systems LLC
Tel: +1-706-510-6800 / +1-800-250-5920 (toll-free)
Web: http://www.evaristesys.com/, http://www.csrpswitch.com/