Re: [OpenSIPS-Users] OpenSIPS timers

2022-04-11 Thread Ovidiu Sas
Just to conclude the thread. The issue here was high load combined
with the fact that tm has two timers (a second-based timer *tm-timer*
that runs every second and a millisecond-based timer *tm-utimer* that
runs every 200ms). Both timers are protected by the same lock, so the
timers cannot run in parallel. The second-based timer *tm-timer*
sometimes takes more than 200ms to complete, which prevents the
millisecond-based timer *tm-utimer* from being executed in its 200ms
window.
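
For anyone curious about the mechanics, here is a minimal sketch of the
pattern in plain pthreads (illustrative only, not the actual tm code):

#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

/* one lock serializes both timer jobs, as in tm */
static pthread_mutex_t timer_lock = PTHREAD_MUTEX_INITIALIZER;

/* stand-in for *tm-timer* (1s period) */
static void *second_timer(void *arg)
{
    (void)arg;
    for (;;) {
        sleep(1);
        pthread_mutex_lock(&timer_lock);
        usleep(300 * 1000); /* under high load a pass can exceed 200ms */
        pthread_mutex_unlock(&timer_lock);
    }
    return NULL;
}

/* stand-in for *tm-utimer* (200ms period) */
static void *msec_timer(void *arg)
{
    (void)arg;
    for (;;) {
        usleep(200 * 1000);
        if (pthread_mutex_trylock(&timer_lock) != 0) {
            /* the 200ms window is lost while tm-timer holds the lock -
             * roughly the condition behind the "already scheduled" warning */
            printf("tm-utimer: missed its 200ms window\n");
            continue;
        }
        /* ... retransmission handling would run here ... */
        pthread_mutex_unlock(&timer_lock);
    }
    return NULL;
}

int main(void)
{
    pthread_t t1, t2;
    pthread_create(&t1, NULL, second_timer, NULL);
    pthread_create(&t2, NULL, msec_timer, NULL);
    pthread_join(t1, NULL);
    return 0;
}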

-ovidiu

On Fri, Apr 1, 2022 at 10:10 AM Ovidiu Sas  wrote:
>
> Hello Bogdan,
>
> During my test, it was tm-utimer only. It was a typo on my side.
>
> I also see in the logs from time to time the other timers too,
> including tm-timer.
>
> What I noticed in my tests is that as soon as I increase the
> timer_partitions, the system is able to handle fewer cps, until the
> workers become 100% loaded and calls start failing (due to
> retransmissions and the udp queue filling up - the udp queue is quite
> big, to accommodate spikes).
>
> Is there a way to make the timer lists more efficient (in terms of ops
> in shared memory)?
>
> Please take a look at the mentioned ticket, as the issue makes the
> ratelimit module unusable (and may have side effects for other modules
> that require accurate timeslots).
> Basically, for a timer that is supposed to fire every second, the
> observed behaviour is that the timer fires at approx. 1s (or a few ms
> less) and then, from time to time, it fires at 1.8s, and the cycle
> repeats.
>
> Thanks,
> Ovidiu
>
> On Fri, Apr 1, 2022 at 9:48 AM Bogdan-Andrei Iancu  
> wrote:
> >
> > Hi Ovidiu,
> >
> > Originally you mentioned tm-utimer, now tm-timer... which one is it?
> > It is very important.
> >
> > When increasing the timer_partitions, what do you mean by the
> > "instability" of the system?
> >
> > Yes, in the reactor, the UDP workers may also handle timer jobs besides
> > the UDP traffic, while the timer procs are 100% dedicated to the timer
> > jobs only. So yes, if the workers are idle, they can act as timer
> > procs too.
> >
> > Increasing the TM_TABLE_ENTRIES should not have too much impact, as the
> > performance of the timer lists (in TM) has nothing to do with the size
> > of the hash table.
> >
> > I will check the mentioned ticket, but if what you are saying about the
> > HP malloc is true, it means the bottleneck is actually in the ops on the
> > shared memory.
> >
> > Best regards,
> >
> > Bogdan-Andrei Iancu
> >
> > OpenSIPS Founder and Developer
> >https://www.opensips-solutions.com
> > OpenSIPS eBootcamp 23rd May - 3rd June 2022
> >https://opensips.org/training/OpenSIPS_eBootcamp_2022/
> >
> > On 4/1/22 12:31 AM, Ovidiu Sas wrote:
> > > Hello Bogdan,
> > >
> > > Thank you for looking into this!
> > >
> > > I get warnings mostly from tm-timer. I've seen warnings from
> > > blcore-expire, dlg-options-pinger, dlg-reinvite-pinger, dlg-timer (in
> > > the logs, but not during my testing).
> > > While testing, I saw only the tm-timer warnings.
> > >
> > > I took a superficial look at the "timer_partitions" and your
> > > explanation matches my findings. However, increasing the
> > > "timer_partitions" makes the system unstable (doesn't matter how many
> > > timer procs we have).
> > > I found that I can get the most out of the system if one
> > > "timer_partition" is used along with one timer_proc.
> > >
> > > With the reactor scheme, a UDP receiver can handle timer jobs, is that
> > > right? If yes, if the UDP workers are idle, there are enough resources
> > > to handle timer jobs, correct?
> > >
> > > I also increased the TM_TABLE_ENTRIES to (1<<18) and saw a small
> > > performance increase, but I will need to test more to come up with a
> > > valid conclusion.
> > >
> > > On the other hand, I noticed a strange behavior on timer handling.
> > > Take a look at:
> > > https://github.com/OpenSIPS/opensips/issues/2797
> > > Not sure if this is related to the warnings that I'm seeing.
> > >
> > > The biggest performance improvement was switching to HP_MALLOC for
> > > both pkg and shm memory.
> > >
> > > I will keep you posted with my findings,
> > > Ovidiu
> > >
> > > On Thu, Mar 31, 2022 at 10:28 AM Bogdan-Andrei Iancu
> > >  wrote:
> > >> Hi Ovidiu,
> > >>
> > >> Regarding the warnings from the timer_ticker, do you get them only for
> > >> the tm-utimer task? I'm asking as the key question here is where the
> > >> bottleneck is: in the whole "timer" subsystem, or in the tm-utimer
> > >> task only?
> > >>
> > >> The TM "timer_partitions" creates multiple parallel timer lists, to
> > >> avoid having large "amounts" of transactions handled at a moment in a
> > >> single tm-utimer task (but rather split/partition the whole amount of
> > >> handled transactions into smaller chunks, to be handled one at a time
> > >> in the timer task).
> > >>
> > >> The "timer_workers" creates  more than one dedicated processes for
> > >> handling the timer tasks (so scales up the timer sub-system).
> > >>
> > >> If you get warnings only on tm-utimer, I suspect the bottleneck is TM
> > >> related, mainly on performing re-transmissions (that's what that task
> > >> is doing). So increasing the timer_partitions should be the way to
> > >> help.

Re: [OpenSIPS-Users] OpenSIPS timers

2022-04-01 Thread Ovidiu Sas
Hello Bogdan,

During my test, it was tm-utimer only. It was a typo on my side.

I also see in the logs from time to time the other timers too,
including tm-timer.

What I noticed in my tests is that as soon as I increase the
timer_partitions, the system is able to handle fewer cps, until the
workers become 100% loaded and calls start failing (due to
retransmissions and the udp queue filling up - the udp queue is quite
big, to accommodate spikes).

Is there a way to make the timer lists more efficient (in terms of ops
in shared memory)?

Please take a look at the mentioned ticket, as the issue makes the
ratelimit module unusable (and may have side effects for other modules
that require accurate timeslots).
Basically, for a timer that is supposed to fire every second, the
observed behaviour is that the timer fires at approx. 1s (or a few ms
less) and then, from time to time, it fires at 1.8s, and the cycle
repeats.
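
A quick way to see the pattern is to log the delta between consecutive
runs; a minimal harness (generic C, nothing OpenSIPS-specific - the
nanosleep just stands in for the 1s timer callback):

#include <stdio.h>
#include <time.h>

/* monotonic "now" in milliseconds */
static double now_ms(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec * 1000.0 + ts.tv_nsec / 1e6;
}

int main(void)
{
    double prev = now_ms();
    for (;;) {
        struct timespec period = { 1, 0 }; /* nominal 1s period */
        nanosleep(&period, NULL);
        double cur = now_ms();
        /* healthy output hovers around 1000ms; the behaviour described
         * above shows runs of ~990-1000ms with ~1800ms outliers */
        printf("interval: %.0f ms\n", cur - prev);
        prev = cur;
    }
}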

Thanks,
Ovidiu

On Fri, Apr 1, 2022 at 9:48 AM Bogdan-Andrei Iancu  wrote:
>
> Hi Ovidiu,
>
> Originally you mentioned tm-utimer, now tm-timer... which one is it?
> It is very important.
>
> When increasing the timer_partitions, what do you mean by the
> "instability" of the system?
>
> Yes, in the reactor, the UDP workers may also handle timer jobs besides
> the UDP traffic, while the timer procs are 100% dedicated to the timer
> jobs only. So yes, if the workers are idle, they can act as timer
> procs too.
>
> Increasing the TM_TABLE_ENTRIES should not have too much impact, as the
> performance of the timer lists (in TM) has nothing to do with the size
> of the hash table.
>
> I will check the mentioned ticket, but if what you are saying about the
> HP malloc is true, it means the bottleneck is actually in the ops on the
> shared memory.
>
> Best regards,
>
> Bogdan-Andrei Iancu
>
> OpenSIPS Founder and Developer
>https://www.opensips-solutions.com
> OpenSIPS eBootcamp 23rd May - 3rd June 2022
>https://opensips.org/training/OpenSIPS_eBootcamp_2022/
>
> On 4/1/22 12:31 AM, Ovidiu Sas wrote:
> > Hello Bogdan,
> >
> > Thank you for looking into this!
> >
> > I get warnings mostly from tm-timer. I've seen warnings from
> > blcore-expire, dlg-options-pinger, dlg-reinvite-pinger, dlg-timer (in
> > the logs, but not during my testing).
> > While testing, I saw only the tm-timer warnings.
> >
> > I took a superficial look at the "timer_partitions" and your
> > explanation matches my findings. However, increasing the
> > "timer_partitions" makes the system unstable (doesn't matter how many
> > timer procs we have).
> > I found that I can get the most out of the system if one
> > "timer_partition" is used along with one timer_proc.
> >
> > With the reactor scheme, a UDP receiver can handle timer jobs, is that
> > right? If yes, if the UDP workers are idle, there are enough resources
> > to handle timer jobs, correct?
> >
> > I also increased the TM_TABLE_ENTRIES to (1<<18) and saw a small
> > performance increase, but I will need to test more to come up with a
> > valid conclusion.
> >
> > On the other hand, I noticed a strange behavior on timer handling.
> > Take a look at:
> > https://github.com/OpenSIPS/opensips/issues/2797
> > Not sure if this is related to the warnings that I'm seeing.
> >
> > The biggest performance improvement was switching to HP_MALLOC for
> > both pkg and shm memory.
> >
> > I will keep you posted with my findings,
> > Ovidiu
> >
> > On Thu, Mar 31, 2022 at 10:28 AM Bogdan-Andrei Iancu
> >  wrote:
> >> Hi Ovidiu,
> >>
> >> Regarding the warnings from the timer_ticker, do you get them only for
> >> the tm-utimer task? I'm asking as the key question here is where the
> >> bottleneck is: in the whole "timer" subsystem, or in the tm-utimer task
> >> only?
> >>
> >> The TM "timer_partitions" creates multiple parallel timer lists, to
> >> avoid having large "amounts" of transactions handled at a moment in a
> >> single tm-utimer task (but rather split/partition the whole amount of
> >> handled transactions into smaller chunks, to be handled one at a time
> >> in the timer task).
> >>
> >> The "timer_workers" creates  more than one dedicated processes for
> >> handling the timer tasks (so scales up the timer sub-system).
> >>
> >> If you get warnings only on tm-utimer, I suspect the bottleneck is TM
> >> related, mainly on performing re-transmissions (that's what that task is
> >> doing). So increasing the timer_partitions should be the way to help.
> >>
> >> Best regards,
> >>
> >> Bogdan-Andrei Iancu
> >>
> >> OpenSIPS Founder and Developer
> >> https://www.opensips-solutions.com
> >> OpenSIPS eBootcamp 23rd May - 3rd June 2022
> >> https://opensips.org/training/OpenSIPS_eBootcamp_2022/
> >>
> >> On 3/24/22 12:54 AM, Ovidiu Sas wrote:
> >>> Hello all,
> >>>
> >>> I'm working on tuning an opensips server. I keep getting this pesky:
> >>> WARNING:core:utimer_ticker: utimer task  already scheduled
> >>> I was trying to get rid of them by playing with the tm
> >>> timer_partitions parameter and the timer_workers core param.

Re: [OpenSIPS-Users] OpenSIPS timers

2022-04-01 Thread Bogdan-Andrei Iancu

Hi Ovidiu,

Originally you mentioned tm-utimer, now tm-timer... which one is it?
It is very important.


When increasing the timer_partitions, what do you mean by the
"instability" of the system?


Yes, in the reactor, the UDP workers may also handle timer jobs besides
the UDP traffic, while the timer procs are 100% dedicated to the timer
jobs only. So yes, if the workers are idle, they can act as timer
procs too.
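
Schematically it looks like this (stand-in names, not the core's actual
reactor API):

#include <stdbool.h>

/* illustrative stand-ins, not the core's reactor types */
bool poll_udp(void);         /* true if a SIP packet is waiting */
void process_udp(void);
bool timer_job_pending(void);
void run_timer_job(void);

/* a UDP worker: SIP traffic first, timer jobs when otherwise idle;
 * a dedicated timer proc runs only the last two steps */
static void udp_worker_loop(void)
{
    for (;;) {
        if (poll_udp()) {
            process_udp();
            continue;
        }
        if (timer_job_pending())
            run_timer_job();
    }
}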


Increasing the TM_TABLE_ENTRIES should not have too much impact, as the
performance of the timer lists (in TM) has nothing to do with the size
of the hash table.
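
In other words, the hash table only shortens the lookup chains, while
the timer cost comes from separate linked lists whose length tracks the
traffic. A conceptual sketch (not the real h_table.h layout):

#include <stddef.h>

#define TM_TABLE_ENTRIES (1 << 18) /* the value tested above */

struct cell {                      /* a transaction, reduced to links */
    struct cell *next_bucket;      /* chaining inside a hash bucket */
    struct cell *next_timer;       /* chaining on a timer list */
};

/* the hash table only matches incoming messages to transactions;
 * a bigger table just means shorter bucket chains */
static struct cell *tm_table[TM_TABLE_ENTRIES];

/* a timer list is walked end to end on every tick, so its cost
 * depends on how many transactions are live, not on the table size */
static struct cell *retransmission_list;

static size_t timer_pass(void)
{
    size_t n = 0;
    for (struct cell *c = retransmission_list; c; c = c->next_timer)
        n++;                       /* real code would retransmit here */
    return n;
}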


I will check the mentioned ticket, but if what you are saying about the
HP malloc is true, it means the bottleneck is actually in the ops on the
shared memory.


Best regards,

Bogdan-Andrei Iancu

OpenSIPS Founder and Developer
  https://www.opensips-solutions.com
OpenSIPS eBootcamp 23rd May - 3rd June 2022
  https://opensips.org/training/OpenSIPS_eBootcamp_2022/

On 4/1/22 12:31 AM, Ovidiu Sas wrote:

Hello Bogdan,

Thank you for looking into this!

I get warnings mostly from tm-timer. I've seen warnings from
blcore-expire, dlg-options-pinger, dlg-reinvite-pinger, dlg-timer (in
the logs, but not during my testing).
While testing, I saw only the tm-timer warnings.

I took a superficial look at the "timer_partitions" and your
explanation matches my findings. However, increasing the
"timer_partitions" makes the system unstable (doesn't matter how many
timer procs we have).
I found that I can get the most out of the system if one
"timer_partition" is used along with one timer_proc.

With the reactor scheme, a UDP receiver can handle timer jobs, is that
right? If yes, if the UDP workers are idle, there are enough resources
to handle timer jobs, correct?

I also increased the TM_TABLE_ENTRIES to (1<<18) and saw a small
performance increase, but I will need to test more to come up with a
valid conclusion.

On the other hand, I noticed a strange behavior on timer handling.
Take a look at:
https://github.com/OpenSIPS/opensips/issues/2797
Not sure if this is related to the warnings that I'm seeing.

The biggest performance improvement was switching to HP_MALLOC for
both pkg and shm memory.

I will keep you posted with my findings,
Ovidiu

On Thu, Mar 31, 2022 at 10:28 AM Bogdan-Andrei Iancu
 wrote:

Hi Ovidiu,

Regarding the warnings from the timer_ticker, do you get them only for
the tm-utimer task? I'm asking as the key question here is where the
bottleneck is: in the whole "timer" subsystem, or in the tm-utimer task only?

The TM "timer_partitions" creates multiple parallel timer lists, to
avoid having large "amounts" of transactions handled at a moment in a
single tm-utimer task (but rather split/partition the whole amount of
handled transactions into smaller chunks, to be handled one at a time in
the timer task).

The "timer_workers" creates  more than one dedicated processes for
handling the timer tasks (so scales up the timer sub-system).

If you get warnings only on tm-utimer, I suspect the bottleneck is TM
related, mainly on performing re-transmissions (that's what that task is
doing). So increasing the timer_partitions should be the way to help.

Best regards,

Bogdan-Andrei Iancu

OpenSIPS Founder and Developer
https://www.opensips-solutions.com
OpenSIPS eBootcamp 23rd May - 3rd June 2022
https://opensips.org/training/OpenSIPS_eBootcamp_2022/

On 3/24/22 12:54 AM, Ovidiu Sas wrote:

Hello all,

I'm working on tuning an opensips server. I keep getting this pesky:
WARNING:core:utimer_ticker: utimer task  already scheduled
I was trying to get rid of them by playing with the tm
timer_partitions parameter and the timer_workers core param.
Increasing either of them doesn't improve performance.
Increasing both of them actually decreases performance.
The server is not at its limit; the load on the UDP workers is around
50-60, with some spikes.
I have around 3500+ cps sipp traffic.

My understanding is that by increasing the number of timer_partitions,
we will have more procs walking in parallel over the timer structures.
If we have one timer structure, we have one proc walking over it.
How does this work for two timer structures? What is the difference
between the first and the second timer structure? Should we expect
less work for each proc?

For now, to reduce the occurrence of the warning log, I increased the
timer interval for tm-utimer from 100ms to 200ms. This should be ok as
the timer has the TIMER_FLAG_DELAY_ON_DELAY flag set.

Thanks,
Ovidiu








Re: [OpenSIPS-Users] OpenSIPS timers

2022-03-31 Thread Ovidiu Sas
Hello Bogdan,

Thank you for looking into this!

I get warnings mostly from tm-timer. I've seen warnings from
blcore-expire, dlg-options-pinger, dlg-reinvite-pinger, dlg-timer (in
the logs, but not during my testing).
While testing, I saw only the tm-timer warnings.

I took a superficial look at the "timer_partitions" and your
explanation matches my findings. However, increasing the
"timer_partitions" makes the system unstable (doesn't matter how many
timer procs we have).
I found that I can get the most out of the system if one
"timer_partition" is used along with one timer_proc.

With the reactor scheme, a UDP receiver can handle timer jobs, is that
right? If yes, if the UDP workers are idle, there are enough resources
to handle timer jobs, correct?

I also increased the TM_TABLE_ENTRIES to (1<<18) and saw a small
performance increase, but I will need to test more to come up with a
valid conclusion.

On the other hand, I noticed a strange behavior on timer handling.
Take a look at:
https://github.com/OpenSIPS/opensips/issues/2797
Not sure if this is related to the warnings that I'm seeing.

The biggest performance improvement was switching to HP_MALLOC for
both pkg and shm memory.
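
That would be consistent with the allocator design: as far as I can
tell, HP_MALLOC shards the free fragments into hash buckets with a lock
per bucket, while the classic allocator serializes every shm operation
on a single lock. A rough illustration (not the actual allocator code):

#include <pthread.h>

#define HASH_SIZE 64 /* illustrative bucket count */

struct frag { struct frag *next; }; /* a free memory fragment */

/* classic style: every alloc/free, in every process, contends here
 * (shown for contrast; in reality this lock lives in shared memory) */
static pthread_mutex_t single_shm_lock = PTHREAD_MUTEX_INITIALIZER;

/* HP_MALLOC style: free fragments bucketed (e.g. by size), one lock
 * per bucket, so parallel workers mostly touch different buckets */
static struct bucket {
    pthread_mutex_t lock;
    struct frag *free_list;
} bucket[HASH_SIZE];

static void buckets_init(void)
{
    for (int i = 0; i < HASH_SIZE; i++)
        pthread_mutex_init(&bucket[i].lock, NULL);
}

static struct frag *sharded_alloc(unsigned int size)
{
    struct bucket *b = &bucket[size % HASH_SIZE]; /* pick a bucket */
    pthread_mutex_lock(&b->lock);
    struct frag *f = b->free_list;
    if (f)
        b->free_list = f->next;
    pthread_mutex_unlock(&b->lock);
    return f;
}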

I will keep you posted with my findings,
Ovidiu

On Thu, Mar 31, 2022 at 10:28 AM Bogdan-Andrei Iancu
 wrote:
>
> Hi Ovidiu,
>
> Regarding the warnings from the timer_ticker, do you get them only for
> the tm-utimer task? I'm asking as the key question here is where the
> bottleneck is: in the whole "timer" subsystem, or in the tm-utimer task
> only?
>
> The TM "timer_partitions" creates multiple parallel timer lists, to
> avoid having large "amounts" of transactions handled at a moment in a
> single tm-utimer task (but rather split/partition the whole amount of
> handled transactions into smaller chunks, to be handled one at a time in
> the timer task).
>
> The "timer_workers" creates  more than one dedicated processes for
> handling the timer tasks (so scales up the timer sub-system).
>
> If you get warnings only on tm-utimer, I suspect the bottleneck is TM
> related, mainly on performing re-transmissions (that's what that task is
> doing). So increasing the timer_partitions should be the way to help.
>
> Best regards,
>
> Bogdan-Andrei Iancu
>
> OpenSIPS Founder and Developer
>https://www.opensips-solutions.com
> OpenSIPS eBootcamp 23rd May - 3rd June 2022
>https://opensips.org/training/OpenSIPS_eBootcamp_2022/
>
> On 3/24/22 12:54 AM, Ovidiu Sas wrote:
> > Hello all,
> >
> > I'm working on tuning an opensips server. I keep getting this pesky:
> > WARNING:core:utimer_ticker: utimer task  already scheduled
> > I was trying to get rid of them by playing with the tm
> > timer_partitions parameter and the timer_workers core param.
> > Increasing either of them doesn't improve performance.
> > Increasing both of them actually decreases performance.
> > The server is not at its limit; the load on the UDP workers is around
> > 50-60, with some spikes.
> > I have around 3500+ cps sipp traffic.
> >
> > My understanding is that by increasing the number of timer_partitions,
> > we will have more procs walking in parallel over the timer structures.
> > If we have one timer structure, we have one proc walking over it.
> > How does this work for two timer structures? What is the difference
> > between the first and the second timer structure? Should we expect
> > less work for each proc?
> >
> > For now, to reduce the occurrence of the warning log, I increased the
> > timer interval for tm-utimer from 100ms to 200ms. This should be ok as
> > the timer has the TIMER_FLAG_DELAY_ON_DELAY flag set.
> >
> > Thanks,
> > Ovidiu
> >
>


-- 
VoIP Embedded, Inc.
http://www.voipembedded.com



Re: [OpenSIPS-Users] OpenSIPS timers

2022-03-31 Thread Bogdan-Andrei Iancu

Hi Ovidiu,

Regarding the warnings from the timer_ticker, do you get them only for
the tm-utimer task? I'm asking as the key question here is where the
bottleneck is: in the whole "timer" subsystem, or in the tm-utimer task only?


The TM "timer_partitions" creates multiple parallel timer lists, to
avoid having large "amounts" of transactions handled at a moment in a
single tm-utimer task (but rather split/partition the whole amount of
handled transactions into smaller chunks, to be handled one at a time in
the timer task).
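
Conceptually it looks like this (names are illustrative, not the tm
internals) - one list and one lock per partition, with each transaction
pinned to a partition by its hash:

#include <pthread.h>

#define N_PARTITIONS 4 /* what the "timer_partitions" modparam controls */

struct cell {                /* a transaction, reduced to its links */
    struct cell *next;
    unsigned int hash;
};

/* one independent timer list per partition, each with its own lock */
static struct partition {
    pthread_mutex_t lock;
    struct cell *list;
} part[N_PARTITIONS];

static void partitions_init(void)
{
    for (int i = 0; i < N_PARTITIONS; i++)
        pthread_mutex_init(&part[i].lock, NULL);
}

/* every transaction is pinned to exactly one partition */
static unsigned int partition_of(struct cell *t)
{
    return t->hash % N_PARTITIONS;
}

/* one timer pass walks only ~1/N of the live transactions */
static void timer_pass(unsigned int p)
{
    pthread_mutex_lock(&part[p].lock);
    for (struct cell *c = part[p].list; c; c = c->next) {
        /* handle timeouts / retransmissions for this chunk */
    }
    pthread_mutex_unlock(&part[p].lock);
}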


The "timer_workers" creates  more than one dedicated processes for 
handling the timer tasks (so scales up the timer sub-system).


If you get warnings only on tm-utimer, I suspect the bottleneck is TM
related, mainly on performing re-transmissions (that's what that task is
doing). So increasing the timer_partitions should be the way to help.


Best regards,

Bogdan-Andrei Iancu

OpenSIPS Founder and Developer
  https://www.opensips-solutions.com
OpenSIPS eBootcamp 23rd May - 3rd June 2022
  https://opensips.org/training/OpenSIPS_eBootcamp_2022/

On 3/24/22 12:54 AM, Ovidiu Sas wrote:

Hello all,

I'm working on tuning an opensips server. I keep getting this pesky:
WARNING:core:utimer_ticker: utimer task  already scheduled
I was trying to get rid of them by playing with the tm
timer_partitions parameter and the timer_workers core param.
Increasing either of them doesn't improve performance.
Increasing both of them actually decreases performance.
The server is not at its limit; the load on the UDP workers is around
50-60, with some spikes.
I have around 3500+ cps sipp traffic.

My understanding is that by increasing the number of timer_partitions,
we will have more procs walking in parallel over the timer structures.
If we have one timer structure, we have one proc walking over it.
How does this work for two timer structures? What is the difference
between the first and the second timer structure? Should we expect
less work for each proc?

For now, to reduce the occurrence of the warning log, I increased the
timer interval for tm-utimer from 100ms to 200ms. This should be ok as
the timer has the TIMER_FLAG_DELAY_ON_DELAY flag set.

Thanks,
Ovidiu






[OpenSIPS-Users] OpenSIPS timers

2022-03-23 Thread Ovidiu Sas
Hello all,

I'm working on tuning an opensips server. I keep getting this pesky:
WARNING:core:utimer_ticker: utimer task  already scheduled
I was trying to get rid of them by playing with the tm
timer_partitions parameter and the timer_workers core param.
Increasing either of them doesn't improve performance.
Increasing both of them actually decreases performance.
The server is not at its limit; the load on the UDP workers is around
50-60, with some spikes.
I have around 3500+ cps sipp traffic.

My understanding is that by increasing the number of timer_partitions,
we will have more procs walking in parallel over the timer structures.
If we have one timer structure, we have one proc walking over it.
How does this work for two timer structures? What is the difference
between the first and the second timer structure? Should we expect
less work for each proc?

For now, to reduce the occurrence of the warning log, I increased the
timer interval for tm-utimer from 100ms to 200ms. This should be ok as
the timer has the TIMER_FLAG_DELAY_ON_DELAY flag set.
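
For reference, the change boils down to the utimer registration in the
tm module; roughly along these lines (paraphrased - the prototype and
flag value below are my assumptions, check the core's timer.h for the
real ones):

/* assumed core API: label, callback, attached data, interval in
 * microseconds, behaviour flags - prototype is hypothetical */
typedef void (utimer_function)(unsigned int ticks, void *param);
int register_utimer(char *label, utimer_function f, void *param,
        unsigned int interval, unsigned short flags);

#define TIMER_FLAG_DELAY_ON_DELAY (1 << 0) /* illustrative value only */

/* the tm-utimer job: walks the retransmission lists */
static void utimer_routine(unsigned int ticks, void *param)
{
    (void)ticks; (void)param;
}

static int tm_init_timers(void)
{
    /* was 100*1000 (100ms); raised to 200*1000 (200ms) - with
     * DELAY_ON_DELAY set, a late run only pushes back the next one
     * instead of piling up, so the coarser interval should be safe */
    return register_utimer("tm-utimer", utimer_routine, NULL,
            200 * 1000, TIMER_FLAG_DELAY_ON_DELAY);
}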

Thanks,
Ovidiu

-- 
VoIP Embedded, Inc.
http://www.voipembedded.com

___
Users mailing list
Users@lists.opensips.org
http://lists.opensips.org/cgi-bin/mailman/listinfo/users