Re: [OpenSIPS-Users] MediaProxy loading issues - I think I need some tuning here

2012-04-10 Thread Saúl Ibarra Corretgé
Hi,

On Apr 5, 2012, at 10:51 PM, Jock McKechnie wrote:

> Thank you for your suggestions;
> 
> I have noticed a very strange symptom but I've yet to determine if it
> actually affects call handling or not. When the -relay is heavily
> loaded it'll have the load spread out across the cores and then,
> suddenly, the CPU usage appears to drift over to a single core and max
> it out for a bit, with the other cores doing nothing and then it
> all spreads out again. The heavier loaded it is, the more time it
> spends on one core. Very strange. I installed irqbalance but it
> appears not to make a difference.
> 

If you are bombarding the server with calls continuously, you could see a CPU 
spike, since call setup is done in a single thread. Once the conntrack rule 
has been created, the kernel takes over and the load is shared across all 
cores. That said, I have never experienced this myself.
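
To see whether the load really drifts onto one core, per-core usage can be 
sampled from /proc/stat. The following is only a minimal sketch, not part of 
MediaProxy: the function names are made up for illustration, and it assumes 
the usual Linux /proc/stat layout (user, nice, system, idle, iowait, irq, 
softirq, steal).

```python
import time


def parse_cpu_times(stat_text):
    """Return {core_name: (busy_jiffies, total_jiffies)} from /proc/stat text."""
    times = {}
    for line in stat_text.splitlines():
        parts = line.split()
        # Per-core lines look like "cpu0 ..."; skip the aggregate "cpu" line
        # and unrelated lines such as "ctxt" or "btime".
        if not parts or not parts[0].startswith("cpu") or parts[0] == "cpu":
            continue
        fields = [int(v) for v in parts[1:9]]  # user .. steal
        idle = fields[3] + (fields[4] if len(fields) > 4 else 0)  # idle + iowait
        total = sum(fields)
        times[parts[0]] = (total - idle, total)
    return times


def per_core_usage(before, after):
    """Return {core_name: percent_busy} between two parse_cpu_times() samples."""
    usage = {}
    for core, (busy1, total1) in after.items():
        busy0, total0 = before.get(core, (0, 0))
        delta_total = total1 - total0
        usage[core] = 100.0 * (busy1 - busy0) / delta_total if delta_total > 0 else 0.0
    return usage


def sample_live(interval=1.0, path="/proc/stat"):
    """Take two samples 'interval' seconds apart and return per-core usage."""
    with open(path) as f:
        first = parse_cpu_times(f.read())
    time.sleep(interval)
    with open(path) as f:
        second = parse_cpu_times(f.read())
    return per_core_usage(first, second)
```

Calling sample_live() in a loop on the relay box while it is under load 
should show whether one core sits near 100% while the others stay idle.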

> I'm hoping the worst this may cause is a slight delay in a call
> starting up with an allocated pair of media ports in iptables
> forwarding, rather than call audio distortion. Have you ever seen
> anything like this?
> 

Since MediaProxy doesn't 'touch' the actual media I don't think it can cause 
distortion. Now, if the system is so overloaded that packets can't leave the 
server 'on time', you may have jitter issues. This is just a hypothesis, 
because it would mean that your server is so overloaded that even sending some 
UDP data is a problem...


Regards,

--
Saúl Ibarra Corretgé
AG Projects




___
Users mailing list
Users@lists.opensips.org
http://lists.opensips.org/cgi-bin/mailman/listinfo/users


Re: [OpenSIPS-Users] MediaProxy loading issues - I think I need some tuning here

2012-04-05 Thread Jock McKechnie
Thank you for your suggestions;

I have noticed a very strange symptom but I've yet to determine if it
actually affects call handling or not. When the -relay is heavily
loaded it'll have the load spread out across the cores and then,
suddenly, the CPU usage appears to drift over to a single core and max
it out for a bit, with the other cores doing nothing and then it
all spreads out again. The heavier loaded it is, the more time it
spends on one core. Very strange. I installed irqbalance but it
appears not to make a difference.

I'm hoping the worst this may cause is a slight delay in a call
starting up with an allocated pair of media ports in iptables
forwarding, rather than call audio distortion. Have you ever seen
anything like this?

 - JP

On Thu, Apr 5, 2012 at 6:58 AM, Saúl Ibarra Corretgé wrote:
> Hi,
>
>>
>> Saúl,
>>
>> You called it. Complete turn around in load-out - no more port
>> complaints and at 900 calls I'm seeing around 20% usage across four
>> cores. And the 'apt-get update' updates -relay for me, which I was
>> expecting to have to build (like I did initially), so I'm very
>> pleased.
>>
>
> I'm happy it works for you now :-)
>
>> When the system is humming along at 900 calls I started noticing these pop up:
>> warning: Aggregate speed calculation time exceeded 10ms: 10401us for 431 sessions
>> Googling shows that this is related to some statistics, so I've turned
>> it off for now to see how far I can push media-proxy.
>>
>
> It's just an indicator of how much data MediaProxy is relaying. Of course, as 
> the number of sessions goes up, this calculation takes longer, and while it 
> is in progress the relay can't accept new sessions. You can either sample at 
> longer intervals, which gives a less accurate measurement, or just disable 
> it if you don't care about it.
>
>> Would you be able to recommend some other settings that I should make
>> to help mediaproxy-relay push as many calls as possible?
>>
>
> Given the fact that the relayed traffic is UDP I'm not sure if something can 
> be tweaked in the kernel, but one thing you can check is interrupts. See if 
> they are hitting a single CPU (check /proc/interrupts) and if so install 
> irqbalance so that interrupts are balanced among the cores. Good network 
> cards also help, of course :-)
>
> Given the fact that MediaProxy doesn't do much while the kernel is relaying 
> the traffic, you'll probably hit limits related to networking first. 
> Nevertheless, MediaProxy was designed with horizontal scalability in mind, so 
> if you need to handle more calls, you can add more relays :-) Also don't 
> forget that a relay can be connected to several dispatchers.
>
>
> Regards,
>
> --
> Saúl Ibarra Corretgé
> AG Projects



Re: [OpenSIPS-Users] MediaProxy loading issues - I think I need some tuning here

2012-04-05 Thread Saúl Ibarra Corretgé
Hi,

> 
> Saúl,
> 
> You called it. Complete turn around in load-out - no more port
> complaints and at 900 calls I'm seeing around 20% usage across four
> cores. And the 'apt-get update' updates -relay for me, which I was
> expecting to have to build (like I did initially), so I'm very
> pleased.
> 

I'm happy it works for you now :-)

> When the system is humming along at 900 calls I started noticing these pop up:
> warning: Aggregate speed calculation time exceeded 10ms: 10401us for 431 sessions
> Googling shows that this is related to some statistics, so I've turned
> it off for now to see how far I can push media-proxy.
> 

It's just an indicator of how much data MediaProxy is relaying. Of course, as 
the number of sessions goes up, this calculation takes longer, and while it is 
in progress the relay can't accept new sessions. You can either sample at 
longer intervals, which gives a less accurate measurement, or just disable it 
if you don't care about it.

> Would you be able to recommend some other settings that I should make
> to help mediaproxy-relay push as many calls as possible?
> 

Given the fact that the relayed traffic is UDP I'm not sure if something can be 
tweaked in the kernel, but one thing you can check is interrupts. See if they 
are hitting a single CPU (check /proc/interrupts) and if so install irqbalance 
so that interrupts are balanced among the cores. Good network cards also help, 
of course :-)
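
The /proc/interrupts check can be scripted. This is only a rough sketch 
under the assumption of the standard Linux layout (a header row of CPU names, 
then one row per interrupt source); the function name and the "eth0" device 
label in the usage note are illustrative, not taken from this setup.

```python
def interrupts_per_cpu(text, device=None):
    """Sum interrupt counts per CPU from /proc/interrupts text.

    If 'device' is given, only rows whose text contains that string
    (for example a network interface name) are counted.
    """
    lines = text.splitlines()
    if not lines:
        return []
    ncpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    totals = [0] * ncpus
    for line in lines[1:]:
        parts = line.split()
        counts = parts[1:1 + ncpus]
        # Rows such as "ERR: 0" carry a single counter, not one per CPU.
        if len(counts) < ncpus or not all(c.isdigit() for c in counts):
            continue
        if device is not None and device not in line:
            continue
        for i, c in enumerate(counts):
            totals[i] += int(c)
    return totals
```

Reading /proc/interrupts with open() and passing the text in, optionally 
with device="eth0", gives one total per core. If nearly everything lands in 
the first entry, the interrupts are hitting a single CPU.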

Given the fact that MediaProxy doesn't do much while the kernel is relaying the 
traffic, you'll probably hit limits related to networking first. Nevertheless, 
MediaProxy was designed with horizontal scalability in mind, so if you need to 
handle more calls, you can add more relays :-) Also don't forget that a relay 
can be connected to several dispatchers.


Regards,

--
Saúl Ibarra Corretgé
AG Projects






Re: [OpenSIPS-Users] MediaProxy loading issues - I think I need some tuning here

2012-04-04 Thread Jock McKechnie
>> I'm running a Deb Wheezy/Sid (unstable) release to keep up with the
>> latest dependencies for MediaProxy's build - which, I admit, I'm using
>> a build package from a few months ago.
>>
>> I've got iptables v1.4.12.2 running, with MediaProxy 2.5.1 (according
>> to the dpkg information after the debuild), so slightly behind that
>> fixed descriptor leak release.
>>
>> The loading on the box was clearly not right with whatever seems to be
>> going wrong, so my making any kind of assumptions on how well
>> MediaProxy works is unfair until I've got this sorted out.
>>
>
> The problem is indeed the file descriptor leak, which was fixed between 2.5.1 
> and 2.5.2. In case you want to verify this yourself, just use lsof on the 
> media-relay PID and start a call: 4 new descriptors will show up, but after 
> the call is ended they are not released.
>
> Please do upgrade to MediaProxy 2.5.2 and test again :-)
>
> FYI, we do have a public Debian repository with MediaProxy built for several 
> Debian and Ubuntu versions, check it out: 
> http://mediaproxy.ag-projects.com/projects/mediaproxy/wiki/InstallationGuide

Saúl,

You called it. Complete turn around in load-out - no more port
complaints and at 900 calls I'm seeing around 20% usage across four
cores. And the 'apt-get update' updates -relay for me, which I was
expecting to have to build (like I did initially), so I'm very
pleased.

When the system is humming along at 900 calls I started noticing these pop up:
warning: Aggregate speed calculation time exceeded 10ms: 10401us for 431 sessions
Googling shows that this is related to some statistics, so I've turned
it off for now to see how far I can push media-proxy.

Would you be able to recommend some other settings that I should make
to help mediaproxy-relay push as many calls as possible?

I'm very grateful for your help;

 - Jock



Re: [OpenSIPS-Users] MediaProxy loading issues - I think I need some tuning here

2012-04-04 Thread Saúl Ibarra Corretgé
Hi Jock,

On Apr 4, 2012, at 3:10 PM, Jock McKechnie wrote:

> Thank you, Saúl, for your swift reply.
> 
> I'm running a Deb Wheezy/Sid (unstable) release to keep up with the
> latest dependencies for MediaProxy's build - which, I admit, I'm using
> a build package from a few months ago.
> 
> I've got iptables v1.4.12.2 running, with MediaProxy 2.5.1 (according
> to the dpkg information after the debuild), so slightly behind that
> fixed descriptor leak release.
> 
> The loading on the box was clearly not right with whatever seems to be
> going wrong, so my making any kind of assumptions on how well
> MediaProxy works is unfair until I've got this sorted out.
> 

The problem is indeed the file descriptor leak, which was fixed between 2.5.1 
and 2.5.2. In case you want to verify this yourself, just use lsof on the 
media-relay PID and start a call: 4 new descriptors will show up, but after the 
call is ended they are not released.
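
As an alternative to eyeballing lsof output, the same check can be sketched 
in Python by listing /proc/<pid>/fd before and after a test call. This is an 
illustrative helper, not something shipped with MediaProxy; the function 
names are made up, and reading another process's fd directory normally 
requires running as root or as the same user.

```python
import os


def list_fds(pid, proc_root="/proc"):
    """Return {fd_number: target} for the open descriptors of a process."""
    fd_dir = os.path.join(proc_root, str(pid), "fd")
    fds = {}
    for name in os.listdir(fd_dir):
        try:
            fds[int(name)] = os.readlink(os.path.join(fd_dir, name))
        except (OSError, ValueError):
            # Descriptor was closed while we were listing, or the entry
            # is not a numbered symlink; ignore it.
            continue
    return fds


def leaked_fds(before, after):
    """Return the descriptors present in 'after' that were not in 'before'."""
    return {fd: target for fd, target in after.items() if fd not in before}
```

Take one list_fds() snapshot of the media-relay PID with no calls up, place 
and end a single call, then take a second snapshot. On an affected version, 
leaked_fds(before, after) should still show the descriptors from that call; 
on a fixed one it should come back empty.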

Please do upgrade to MediaProxy 2.5.2 and test again :-)

FYI, we do have a public Debian repository with MediaProxy built for several 
Debian and Ubuntu versions, check it out: 
http://mediaproxy.ag-projects.com/projects/mediaproxy/wiki/InstallationGuide


Regards,

--
Saúl Ibarra Corretgé
AG Projects






Re: [OpenSIPS-Users] MediaProxy loading issues - I think I need some tuning here

2012-04-04 Thread Jock McKechnie
Thank you, Saúl, for your swift reply.

I'm running a Deb Wheezy/Sid (unstable) release to keep up with the
latest dependencies for MediaProxy's build - which, I admit, I'm using
a build package from a few months ago.

I've got iptables v1.4.12.2 running, with MediaProxy 2.5.1 (according
to the dpkg information after the debuild), so slightly behind that
fixed descriptor leak release.

The loading on the box was clearly not right with whatever seems to be
going wrong, so my making any kind of assumptions on how well
MediaProxy works is unfair until I've got this sorted out.

Thank you, again.

 - JP


On Wed, Apr 4, 2012 at 1:46 AM, Saúl Ibarra Corretgé wrote:
> Hi Jock,
>
> What MediaProxy version are you running?
>
> On Apr 3, 2012, at 10:50 PM, Jock McKechnie wrote:
>
>> Greetings all;
>>
>> We have several mediaproxy systems running in small scale production
>> (~50-100 calls concurrently) and have been very pleased with the
>> results. We find that we have to restart the relay/dispatcher machines
>> daily to keep them ticking over (they tend to get lost on their own
>> after a few days runtime), but this is a minor inconvenience.
>>
>
> What do you mean by "get lost on their own"?
>
>> Until today. Today I tried moving one of our small carrier circuits
>> over to it and gee whiz did all sorts of exciting things happen. I
>> have our systems set up with an initial OpenSIPS/media-dispatcher
>> running on a VM (public IP). This dispatcher speaks to a blade server
>> which is running a single media-relay instance.
>>
>> Under light load all is well. When the load starts ramping up (800+
>> calls) things start going a bit pear-shaped, however. I end up with
>> massive numbers of entries like this in the syslog of the relay:
>> Cannot use port pair 53378/53379
>> Which appears to bog the whole relay down to the point where it's
>> using 100% of the core. Even after turning the calls back off, the
>> -relay remains at 100% and continues to dump more 'Cannot use port
>> pair' notices into rsyslog and is impossible to stop normally due to
>> it being so tied up. rsyslog was not loaded out in the 'top', so
>> although it was clearly being hammered by -relay, I don't think
>> rsyslog was the bottleneck here.
>>
>
> There was a very nasty bug after an API change in iptables which caused 
> socket descriptors to be leaked, which led to this situation. What version of 
> iptables are you using? (iptables -V).
>
>> I guess my first question is, what am I doing wrong here to cause it
>> to be pushing literally tens of thousands of these errors?
>>
>> And then, next, how do I best tune mediaproxy to handle larger loads?
>> I was thinking I could run several -relays on a single blade as they
>> appear to be single-threaded and, therefore, multiple forks will load
>> across the machine properly... but I'm not even sure if -relay can use
>> a different conf file to the default.
>>
>
> Yes, MediaProxy is single-threaded, but the actual relaying of packets happens 
> in *kernel space*, not in that single thread. Thus, you shouldn't run more 
> than one relay on a single box, and that's why it's not even supported. If 
> one box is not enough, just add another one with another instance of 
> MediaProxy relay :-)
>
>> The dispatcher, which as I said lives on the OpenSIPS vm, looks like this:
>> [Dispatcher]
>> socket_path=/tmp/dispatcher.sock
>> listen=dispatcher.public.ip.address
>> management_use_tls=no
>> log_level=WARNING
>>
>> The relay, on a Dell M610 blade, looks like:
>> [Relay]
>> dispatchers=dispatcher.public.ip.address
>> relay_ip=relay.public.ip.address
>> port_range=5:6
>> log_level=WARNING
>>
>> Any suggestions would be gratefully received;
>>
>
>
> Regards,
>
> --
> Saúl Ibarra Corretgé
> AG Projects



Re: [OpenSIPS-Users] MediaProxy loading issues - I think I need some tuning here

2012-04-03 Thread Saúl Ibarra Corretgé
Hi Jock,

What MediaProxy version are you running?

On Apr 3, 2012, at 10:50 PM, Jock McKechnie wrote:

> Greetings all;
> 
> We have several mediaproxy systems running in small scale production
> (~50-100 calls concurrently) and have been very pleased with the
> results. We find that we have to restart the relay/dispatcher machines
> daily to keep them ticking over (they tend to get lost on their own
> after a few days runtime), but this is a minor inconvenience.
> 

What do you mean by "get lost on their own"?

> Until today. Today I tried moving one of our small carrier circuits
> over to it and gee whiz did all sorts of exciting things happen. I
> have our systems set up with an initial OpenSIPS/media-dispatcher
> running on a VM (public IP). This dispatcher speaks to a blade server
> which is running a single media-relay instance.
> 
> Under light load all is well. When the load starts ramping up (800+
> calls) things start going a bit pear-shaped, however. I end up with
> massive numbers of entries like this in the syslog of the relay:
> Cannot use port pair 53378/53379
> Which appears to bog the whole relay down to the point where it's
> using 100% of the core. Even after turning the calls back off, the
> -relay remains at 100% and continues to dump more 'Cannot use port
> pair' notices into rsyslog and is impossible to stop normally due to
> it being so tied up. rsyslog was not loaded out in the 'top', so
> although it was clearly being hammered by -relay, I don't think
> rsyslog was the bottleneck here.
> 

There was a very nasty bug after an API change in iptables which caused socket 
descriptors to be leaked, which led to this situation. What version of iptables 
are you using? (iptables -V).

> I guess my first question is, what am I doing wrong here to cause it
> to be pushing literally tens of thousands of these errors?
> 
> And then, next, how do I best tune mediaproxy to handle larger loads?
> I was thinking I could run several -relays on a single blade as they
> appear to be single-threaded and, therefore, multiple forks will load
> across the machine properly... but I'm not even sure if -relay can use
> a different conf file to the default.
> 

Yes, MediaProxy is single-threaded, but the actual relaying of packets happens 
in *kernel space*, not in that single thread. Thus, you shouldn't run more than 
one relay on a single box, and that's why it's not even supported. If one box 
is not enough, just add another one with another instance of MediaProxy relay 
:-)

> The dispatcher, which as I said lives on the OpenSIPS vm, looks like this:
> [Dispatcher]
> socket_path=/tmp/dispatcher.sock
> listen=dispatcher.public.ip.address
> management_use_tls=no
> log_level=WARNING
> 
> The relay, on a Dell M610 blade, looks like:
> [Relay]
> dispatchers=dispatcher.public.ip.address
> relay_ip=relay.public.ip.address
> port_range=5:6
> log_level=WARNING
> 
> Any suggestions would be gratefully received;
> 


Regards,

--
Saúl Ibarra Corretgé
AG Projects






[OpenSIPS-Users] MediaProxy loading issues - I think I need some tuning here

2012-04-03 Thread Jock McKechnie
Greetings all;

We have several mediaproxy systems running in small scale production
(~50-100 calls concurrently) and have been very pleased with the
results. We find that we have to restart the relay/dispatcher machines
daily to keep them ticking over (they tend to get lost on their own
after a few days runtime), but this is a minor inconvenience.

Until today. Today I tried moving one of our small carrier circuits
over to it and gee whiz did all sorts of exciting things happen. I
have our systems set up with an initial OpenSIPS/media-dispatcher
running on a VM (public IP). This dispatcher speaks to a blade server
which is running a single media-relay instance.

Under light load all is well. When the load starts ramping up (800+
calls) things start going a bit pear-shaped, however. I end up with
massive numbers of entries like this in the syslog of the relay:
Cannot use port pair 53378/53379
Which appears to bog the whole relay down to the point where it's
using 100% of the core. Even after turning the calls back off, the
-relay remains at 100% and continues to dump more 'Cannot use port
pair' notices into rsyslog and is impossible to stop normally due to
it being so tied up. rsyslog was not loaded out in the 'top', so
although it was clearly being hammered by -relay, I don't think
rsyslog was the bottleneck here.

I guess my first question is, what am I doing wrong here to cause it
to be pushing literally tens of thousands of these errors?

And then, next, how do I best tune mediaproxy to handle larger loads?
I was thinking I could run several -relays on a single blade as they
appear to be single-threaded and, therefore, multiple forks will load
across the machine properly... but I'm not even sure if -relay can use
a different conf file to the default.

The dispatcher, which as I said lives on the OpenSIPS vm, looks like this:
[Dispatcher]
socket_path=/tmp/dispatcher.sock
listen=dispatcher.public.ip.address
management_use_tls=no
log_level=WARNING

The relay, on a Dell M610 blade, looks like:
[Relay]
dispatchers=dispatcher.public.ip.address
relay_ip=relay.public.ip.address
port_range=5:6
log_level=WARNING

Any suggestions would be gratefully received;

 - Jock
