Hello Gavin,
The errors you get indicate that OpenSIPS is trying to open a TCP
connection to a destination which does not accept it. Based on your
description, I would say there is no need for OpenSIPS to open TCP
connections - they will be opened by the clients when registering.
Ruling out the scenario of a misrouting, the only explanation would be
that the TCP connections expire (timeout without traffic) long before
the corresponding registrations - so you end up with a registration (in
usrloc) which has no TCP conn towards the actual device, and OpenSIPS
has to try opening a new one. Are you using the tcp_persistent_flag?
http://www.opensips.org/html/docs/modules/1.9.x/registrar.html#id250105
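To make the mismatch concrete, here is a rough Python sketch of the
arithmetic. It assumes a connection that stays idle after the REGISTER
and a hypothetical registration Expires of 3600 seconds (your actual
value is not in the logs); 610 is the tcp_connection_lifetime from your
config. The function name is made up for illustration.

```python
def seconds_without_connection(registration_expires, tcp_connection_lifetime):
    """Seconds during which a binding is still valid in usrloc although
    its idle TCP connection has already been closed.

    Assumes no traffic on the connection after the REGISTER.
    """
    return max(0, registration_expires - tcp_connection_lifetime)


# 3600 is a hypothetical Expires value; 610 comes from the posted config.
gap = seconds_without_connection(3600, 610)
print("binding outlives the idle connection by", gap, "seconds")
```

With those numbers the binding would stay in usrloc for 2990 seconds
after the idle connection is gone. If the registration lifetime is
shorter than the connection lifetime, the gap is zero.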
About the load on the processes, you can run "opensipsctl fifo ps" to get
the listing of the processes and their descriptions - you can correlate
it with the top output to see which process is burning CPU.
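If it helps, the correlation can be scripted. This is only a sketch: it
assumes top's default column layout (PID first, %CPU ninth), and the
PID-to-description mapping is something you would fill in by hand from
the "opensipsctl fifo ps" listing. The descriptions below are
placeholders, not taken from your system.

```python
def parse_top_line(line):
    """Return (pid, cpu_percent) from one row of top's process table.

    Assumes the default columns:
    PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
    """
    fields = line.split()
    return int(fields[0]), float(fields[8])


def busiest(top_lines, descriptions):
    """Sort processes by CPU usage (highest first) and attach the
    description known for each PID, or "unknown" if there is none."""
    rows = [parse_top_line(line) for line in top_lines if line.strip()]
    rows.sort(key=lambda row: row[1], reverse=True)
    return [(pid, cpu, descriptions.get(pid, "unknown")) for pid, cpu in rows]


# Two rows copied from the top snapshot in the original mail.
top_lines = [
    "27540 rcsuser 20 0 6516m 182m 180m S 11 2.3 0:53.64 opensips",
    "27577 rcsuser 20 0 6516m 2.5g 2.5g R 76 31.9 8:15.26 opensips",
]
# Placeholder descriptions: replace with what "opensipsctl fifo ps" reports.
descriptions = {27577: "placeholder description A"}

for pid, cpu, desc in busiest(top_lines, descriptions):
    print(pid, cpu, desc)
```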
Regards,
Bogdan-Andrei Iancu
OpenSIPS Founder and Developer
http://www.opensips-solutions.com
On 04/26/2013 05:44 PM, Gavin Murphy wrote:
We're trying to load up opensips with as many TCP connections as we
possibly can. So far we've got it to about 82K, but failures start
occurring at that point. We have 8GBs of RAM allocated to the server
as a whole (is that enough? we don't appear to be exhausting it).
We've set the following parameters for OpenSIPS:
tcp_children=32
tcp_max_connections=250000
tcp_connection_lifetime=610
tcp_keepalive=1
tcp_keepcount=3
tcp_keepidle=300
tcp_keepinterval=300
We have also set ulimit -n 1024000 and ulimit -s 768.
The scenario is that our load driver establishes "client" connections
to OpenSIPS via TCP, and sends REGISTERs over those connections. While
the REGISTERs come in over TCP, they are sent out to our registrar via
UDP. Around the point where we get to the 40K connection mark we start
seeing the following in the logs:
Apr 25 12:28:19 blackmamba rcsuser-opensips[27540]: ERROR:core:tcp_blocking_connect: poll error: flags 1c
Apr 25 12:28:19 blackmamba rcsuser-opensips[27540]: ERROR:core:tcp_blocking_connect: failed to retrieve SO_ERROR (111) Connection refused
Apr 25 12:28:19 blackmamba rcsuser-opensips[27540]: ERROR:core:tcpconn_connect: tcp_blocking_connect failed
Apr 25 12:28:19 blackmamba rcsuser-opensips[27540]: ERROR:core:tcp_send: connect failed
Apr 25 12:28:19 blackmamba rcsuser-opensips[27540]: ERROR:tm:msg_send: tcp_send failed
It almost appears as though opensips is trying to establish a
connection somewhere and is being refused. Except that it shouldn't be
trying to establish any, unless it's for internal purposes.
Unfortunately the logs aren't clear on that point (in terms of what
connection is trying to be established).
One other thing that appears puzzling: it seems that one of the
opensips processes is bearing most of the brunt. I am assuming that
it's the instance that is actually accepting the connections, and that
the subsequent (low) amount of traffic is then handed off to the
children. But if that's the case, it also means that it's handling a
lot of the workload, and I was hoping that it would be more evenly
distributed.
Here is a snapshot of the opensips processes in top:
27577 rcsuser 20 0 6516m 2.5g 2.5g R 76 31.9 8:15.26 opensips
27542 rcsuser 20 0 6516m 181m 180m S 16 2.3 0:54.60 opensips
27541 rcsuser 20 0 6516m 182m 180m S 14 2.3 0:54.47 opensips
27539 rcsuser 20 0 6516m 182m 180m S 13 2.3 0:53.75 opensips
27540 rcsuser 20 0 6516m 182m 180m S 11 2.3 0:53.64 opensips
27545 rcsuser 20 0 6516m 37m 29m S 0 0.5 0:01.03 opensips
27551 rcsuser 20 0 6516m 35m 27m S 0 0.4 0:00.94 opensips
27553 rcsuser 20 0 6516m 36m 28m S 0 0.5 0:00.95 opensips
27555 rcsuser 20 0 6516m 37m 29m S 0 0.5 0:00.99 opensips
27557 rcsuser 20 0 6516m 35m 27m S 0 0.4 0:00.92 opensips
27558 rcsuser 20 0 6516m 35m 27m S 0 0.4 0:00.90 opensips
27560 rcsuser 20 0 6516m 36m 28m S 0 0.5 0:00.98 opensips
27563 rcsuser 20 0 6516m 36m 28m S 0 0.5 0:00.94 opensips
27564 rcsuser 20 0 6516m 36m 27m S 0 0.5 0:00.93 opensips
27565 rcsuser 20 0 6516m 36m 28m S 0 0.5 0:00.93 opensips
27567 rcsuser 20 0 6516m 36m 28m S 0 0.5 0:00.95 opensips
27575 rcsuser 20 0 6516m 36m 28m S 0 0.5 0:00.95 opensips
27576 rcsuser 20 0 6516m 36m 28m S 0 0.5 0:00.98 opensips
So basically what I'm looking for is some help on getting the
operating system and opensips tuned to the point where we can get
substantially more than 80K connections. Or am I asking for too much?
Thanks,
Gavin
_______________________________________________
Users mailing list
Users@lists.opensips.org
http://lists.opensips.org/cgi-bin/mailman/listinfo/users