We're trying to load OpenSIPS with as many TCP connections as we
possibly can. So far we've reached about 82K, but failures start
occurring at that point. The server has 8 GB of RAM allocated as a
whole (is that enough? We don't appear to be exhausting it). We've set
the following parameters for OpenSIPS:
tcp_children=32
tcp_max_connections=250000
tcp_connection_lifetime=610
tcp_keepalive=1
tcp_keepcount=3
tcp_keepidle=300
tcp_keepinterval=300
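As a sanity check on the keepalive values, here is a minimal sketch of how long they would take to give up on a silent peer. It assumes the three tcp_keep* parameters map directly onto the Linux TCP_KEEPIDLE, TCP_KEEPINTVL and TCP_KEEPCNT socket options; the function name is only illustrative.

```python
def keepalive_detection_seconds(idle, interval, count):
    """Approximate time for standard TCP keepalive to give up on a silent peer.

    The first probe goes out after `idle` seconds of silence; the connection
    is dropped once `count` probes, sent `interval` seconds apart, have gone
    unanswered.
    """
    return idle + interval * count


# Values taken from the parameter list above.
detection = keepalive_detection_seconds(idle=300, interval=300, count=3)
lifetime = 610  # tcp_connection_lifetime

print(f"keepalive gives up after {detection}s; connection lifetime is {lifetime}s")
print("lifetime expires first" if lifetime < detection else "keepalive fires first")
```

If that mapping holds, keepalive would need 1200 seconds to declare a peer dead, so an idle connection would reach the 610-second lifetime first.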
We have also set ulimit -n 1024000 and ulimit -s 768.
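Since ulimit only affects the shell it is run in and the processes started from it, it may be worth confirming what the running OpenSIPS processes actually inherited. A minimal sketch, assuming a Linux host where /proc/<pid>/limits is readable (the helper names are hypothetical):

```python
import re
from pathlib import Path


def parse_proc_limits(text):
    """Parse the contents of /proc/<pid>/limits into {name: (soft, hard)}."""
    limits = {}
    for line in text.splitlines()[1:]:  # skip the header row
        # Columns are separated by runs of two or more spaces.
        fields = re.split(r"\s{2,}", line.strip())
        if len(fields) >= 3:
            limits[fields[0]] = (fields[1], fields[2])
    return limits


def limits_for_pid(pid):
    """Read the limits the kernel is enforcing for a running process."""
    return parse_proc_limits(Path(f"/proc/{pid}/limits").read_text())


# Demonstration against this script's own process; substitute an OpenSIPS PID.
if Path("/proc/self/limits").exists():
    own = limits_for_pid("self")
    print("Max open files:", own.get("Max open files"))
    print("Max stack size:", own.get("Max stack size"))
```

Note that ulimit -s is given in kilobytes while /proc reports the stack size in bytes, so 768 should show up there as 786432.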
The scenario is that our load driver establishes "client" connections to
OpenSIPS via TCP, and sends REGISTERs over those connections. While the
REGISTERs come in over TCP, they are sent out to our registrar via UDP.
Around the point where we get to the 40K connection mark we start seeing
the following in the logs:
Apr 25 12:28:19 blackmamba rcsuser-opensips[27540]: ERROR:core:tcp_blocking_connect: poll error: flags 1c
Apr 25 12:28:19 blackmamba rcsuser-opensips[27540]: ERROR:core:tcp_blocking_connect: failed to retrieve SO_ERROR (111) Connection refused
Apr 25 12:28:19 blackmamba rcsuser-opensips[27540]: ERROR:core:tcpconn_connect: tcp_blocking_connect failed
Apr 25 12:28:19 blackmamba rcsuser-opensips[27540]: ERROR:core:tcp_send: connect failed
Apr 25 12:28:19 blackmamba rcsuser-opensips[27540]: ERROR:tm:msg_send: tcp_send failed
It looks as though OpenSIPS is trying to establish an outbound TCP
connection somewhere and is being refused, except that it shouldn't be
trying to establish any, unless it's for internal purposes.
Unfortunately, the logs don't say which connection it is trying to
establish.
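For reference, the two numeric values in those log lines can be decoded as follows. This is a minimal sketch that assumes the "flags 1c" value is printed in hexadecimal and uses the Linux bit values for poll(2); the helper name is only illustrative.

```python
import os

# Linux values for the poll(2) event bits (from <poll.h>); other platforms
# may use different numbers.
POLL_BITS = {
    0x001: "POLLIN",
    0x002: "POLLPRI",
    0x004: "POLLOUT",
    0x008: "POLLERR",
    0x010: "POLLHUP",
    0x020: "POLLNVAL",
}


def decode_poll_flags(revents):
    """Return the names of the poll(2) bits set in `revents`, lowest bit first."""
    return [name for bit, name in sorted(POLL_BITS.items()) if revents & bit]


print(decode_poll_flags(0x1C))
print(os.strerror(111))  # errno 111 is ECONNREFUSED on Linux
```

Under those assumptions 0x1c is POLLOUT, POLLERR and POLLHUP together, which along with errno 111 is consistent with an outgoing connect() being actively rejected by the remote end rather than timing out.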
One other puzzling thing: one of the OpenSIPS processes seems to be
bearing the brunt of the load. I am assuming that it's the process
that actually accepts the connections, and that the subsequent (low)
amount of traffic is then handed off to the children. But if that's
the case, it also means that this one process is handling much of the
workload, and I was hoping it would be more evenly distributed.
Here is a snapshot of the OpenSIPS processes in top:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
27577 rcsuser 20 0 6516m 2.5g 2.5g R 76 31.9 8:15.26 opensips
27542 rcsuser 20 0 6516m 181m 180m S 16 2.3 0:54.60 opensips
27541 rcsuser 20 0 6516m 182m 180m S 14 2.3 0:54.47 opensips
27539 rcsuser 20 0 6516m 182m 180m S 13 2.3 0:53.75 opensips
27540 rcsuser 20 0 6516m 182m 180m S 11 2.3 0:53.64 opensips
27545 rcsuser 20 0 6516m 37m 29m S 0 0.5 0:01.03 opensips
27551 rcsuser 20 0 6516m 35m 27m S 0 0.4 0:00.94 opensips
27553 rcsuser 20 0 6516m 36m 28m S 0 0.5 0:00.95 opensips
27555 rcsuser 20 0 6516m 37m 29m S 0 0.5 0:00.99 opensips
27557 rcsuser 20 0 6516m 35m 27m S 0 0.4 0:00.92 opensips
27558 rcsuser 20 0 6516m 35m 27m S 0 0.4 0:00.90 opensips
27560 rcsuser 20 0 6516m 36m 28m S 0 0.5 0:00.98 opensips
27563 rcsuser 20 0 6516m 36m 28m S 0 0.5 0:00.94 opensips
27564 rcsuser 20 0 6516m 36m 27m S 0 0.5 0:00.93 opensips
27565 rcsuser 20 0 6516m 36m 28m S 0 0.5 0:00.93 opensips
27567 rcsuser 20 0 6516m 36m 28m S 0 0.5 0:00.95 opensips
27575 rcsuser 20 0 6516m 36m 28m S 0 0.5 0:00.95 opensips
27576 rcsuser 20 0 6516m 36m 28m S 0 0.5 0:00.98 opensips
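To put a number on that imbalance, here is a small sketch that takes the PID and CPU percentage of the five busy processes in the snapshot and computes each one's share of their combined CPU usage (the function name is only illustrative):

```python
# The five busiest rows from the top snapshot: (PID, %CPU).
busy = [(27577, 76), (27542, 16), (27541, 14), (27539, 13), (27540, 11)]


def cpu_shares(rows):
    """Return {pid: fraction of the summed %CPU} for the given rows."""
    total = sum(cpu for _, cpu in rows)
    return {pid: cpu / total for pid, cpu in rows}


for pid, share in cpu_shares(busy).items():
    print(f"{pid}: {share:.0%}")
```

By this measure PID 27577 accounts for roughly 58% of the CPU consumed by the active processes, with the other four at about 8% to 12% each.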
So basically, what I'm looking for is some help getting the operating
system and OpenSIPS tuned to the point where we can handle substantially
more than 80K connections. Or am I asking for too much?
Thanks,
Gavin