Hello Gavin,

The errors you get indicate that OpenSIPS is trying to open a TCP connection to a destination which does not accept it. Based on your description, I would say there is no need for OpenSIPS to open TCP connections - they will be opened by the clients when registering.

Ruling out the scenario of misrouting, the only explanation would be that the TCP connections expire (timeout without traffic) long before the corresponding registrations - so you end up with a registration (in usrloc) which has no TCP connection towards the actual device. Are you using the tcp_persistent_flag? http://www.opensips.org/html/docs/modules/1.9.x/registrar.html#id250105
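
To make the timing argument concrete, here is a minimal sketch in Python. It assumes that an idle connection is closed once tcp_connection_lifetime seconds pass without traffic, and that the only traffic on the connection is the client's periodic re-REGISTER. The function name and the refresh intervals are made up for illustration; only the 610 comes from the posted config.

```python
def connection_survives(tcp_connection_lifetime: int, refresh_interval: int) -> bool:
    """Return True if an idle TCP connection is still open at the next re-REGISTER.

    Assumption: the connection is dropped after `tcp_connection_lifetime`
    seconds without traffic, and the only traffic is the periodic re-REGISTER.
    """
    return refresh_interval < tcp_connection_lifetime


# tcp_connection_lifetime=610 is taken from the posted config;
# the refresh intervals below are hypothetical.
for refresh in (300, 3600):
    ok = connection_survives(610, refresh)
    print(f"refresh every {refresh}s -> connection survives: {ok}")
```

If the clients refresh less often than the lifetime allows, the usrloc entry stays valid while the connection behind it is already gone, which would match the connect attempts seen in the log.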

About the load on the processes, you can run "opensipsctl fifo ps" to get the listing of the processes and their descriptions - you can correlate it with the top output to see which process is burning CPU.
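
As a rough sketch of the first half of that correlation, the Python snippet below ranks processes by CPU from saved top output. It assumes the default top column order (PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND); the function name is hypothetical, and the sample lines are copied from the top snapshot quoted below.

```python
def rank_by_cpu(top_lines):
    """Return (pid, cpu_percent) pairs sorted by CPU usage, highest first.

    Assumes the default top column order:
    PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
    """
    rows = []
    for line in top_lines:
        fields = line.split()
        if len(fields) < 12 or not fields[0].isdigit():
            continue  # skip headers and blank lines
        rows.append((int(fields[0]), float(fields[8])))
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Three lines copied from the top snapshot in the quoted message.
sample = [
    "27577 rcsuser   20   0 6516m 2.5g 2.5g R   76 31.9   8:15.26 opensips",
    "27542 rcsuser   20   0 6516m 181m 180m S   16  2.3   0:54.60 opensips",
    "27545 rcsuser   20   0 6516m  37m  29m S    0  0.5   0:01.03 opensips",
]
for pid, cpu in rank_by_cpu(sample):
    print(pid, cpu)
```

The PID at the head of that list is the one to look up in the "opensipsctl fifo ps" listing to see what kind of process it is.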

Regards,

Bogdan-Andrei Iancu
OpenSIPS Founder and Developer
http://www.opensips-solutions.com


On 04/26/2013 05:44 PM, Gavin Murphy wrote:
We're trying to load up opensips with as many TCP connections as we possibly can. So far we've got it to about 82K, but failures start occurring at that point. We have 8GBs of RAM allocated to the server as a whole (is that enough? we don't appear to be exhausting it). We've set the following parameters for OpenSIPS:

tcp_children=32
tcp_max_connections=250000
tcp_connection_lifetime=610
tcp_keepalive=1
tcp_keepcount=3
tcp_keepidle=300
tcp_keepinterval=300

We have also set ulimit -n 1024000 and ulimit -s 768.

The scenario is that our load driver establishes "client" connections to OpenSIPS via TCP, and sends REGISTERs over those connections. While the REGISTERs come in over TCP, they are sent out to our registrar via UDP. Around the point where we get to the 40K connection mark we start seeing the following in the logs:

Apr 25 12:28:19 blackmamba rcsuser-opensips[27540]: ERROR:core:tcp_blocking_connect: poll error: flags 1c
Apr 25 12:28:19 blackmamba rcsuser-opensips[27540]: ERROR:core:tcp_blocking_connect: failed to retrieve SO_ERROR (111) Connection refused
Apr 25 12:28:19 blackmamba rcsuser-opensips[27540]: ERROR:core:tcpconn_connect: tcp_blocking_connect failed
Apr 25 12:28:19 blackmamba rcsuser-opensips[27540]: ERROR:core:tcp_send: connect failed
Apr 25 12:28:19 blackmamba rcsuser-opensips[27540]: ERROR:tm:msg_send: tcp_send failed

It almost appears as though opensips is trying to establish a connection somewhere and is being refused. Except that it shouldn't be trying to establish any, unless it's for internal purposes. Unfortunately the logs aren't clear on that point (in terms of what connection is trying to be established).

One other thing that appears puzzling: it seems that one of the opensips processes is bearing most of the brunt. I am assuming that it's the instance that is actually accepting the connections, and that the subsequent (low) amount of traffic is then handed off to the children. But if that's the case, it also means that it's handling a lot of the workload, and I was hoping that it would be more evenly distributed.

Here is a snapshot of the opensips processes in top:

27577 rcsuser   20   0 6516m 2.5g 2.5g R   76 31.9   8:15.26 opensips
27542 rcsuser   20   0 6516m 181m 180m S   16  2.3   0:54.60 opensips
27541 rcsuser   20   0 6516m 182m 180m S   14  2.3   0:54.47 opensips
27539 rcsuser   20   0 6516m 182m 180m S   13  2.3   0:53.75 opensips
27540 rcsuser   20   0 6516m 182m 180m S   11  2.3   0:53.64 opensips
27545 rcsuser   20   0 6516m  37m  29m S    0  0.5   0:01.03 opensips
27551 rcsuser   20   0 6516m  35m  27m S    0  0.4   0:00.94 opensips
27553 rcsuser   20   0 6516m  36m  28m S    0  0.5   0:00.95 opensips
27555 rcsuser   20   0 6516m  37m  29m S    0  0.5   0:00.99 opensips
27557 rcsuser   20   0 6516m  35m  27m S    0  0.4   0:00.92 opensips
27558 rcsuser   20   0 6516m  35m  27m S    0  0.4   0:00.90 opensips
27560 rcsuser   20   0 6516m  36m  28m S    0  0.5   0:00.98 opensips
27563 rcsuser   20   0 6516m  36m  28m S    0  0.5   0:00.94 opensips
27564 rcsuser   20   0 6516m  36m  27m S    0  0.5   0:00.93 opensips
27565 rcsuser   20   0 6516m  36m  28m S    0  0.5   0:00.93 opensips
27567 rcsuser   20   0 6516m  36m  28m S    0  0.5   0:00.95 opensips
27575 rcsuser   20   0 6516m  36m  28m S    0  0.5   0:00.95 opensips
27576 rcsuser   20   0 6516m  36m  28m S    0  0.5   0:00.98 opensips

So basically what I'm looking for is some help on getting the operating system and opensips tuned to the point where we can get substantially more than 80K connections. Or am I asking for too much?

Thanks,

Gavin


_______________________________________________
Users mailing list
Users@lists.opensips.org
http://lists.opensips.org/cgi-bin/mailman/listinfo/users


