-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviewboard.asterisk.org/r/3568/#review12062
-----------------------------------------------------------

Ship it!


The changes look good. It's too bad we have to go back to pulling channel 
information from the channel. Pulling from the snapshots was nice in deadlock 
situations.

- opticron


On June 1, 2014, 4:16 p.m., Matt Jordan wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviewboard.asterisk.org/r/3568/
> -----------------------------------------------------------
> 
> (Updated June 1, 2014, 4:16 p.m.)
> 
> 
> Review request for Asterisk Developers.
> 
> 
> Bugs: ASTERISK-23811
>     https://issues.asterisk.org/jira/browse/ASTERISK-23811
> 
> 
> Repository: Asterisk
> 
> 
> Description
> -------
> 
> During some performance testing of Asterisk with AGI, ARI, and lots of Local 
> channels, we noticed that there's quite a hit in performance during channel 
> creation and releasing to the dialplan (ARI continue). After investigating 
> the performance spike that occurs during channel creation, we discovered that 
> we create a lot of channel snapshots that are technically unnecessary. This 
> includes creating snapshots during:
> 
> * AGI execution
> * Returning objects for ARI commands
> * During some Local channel operations
> * During some dialling operations
> * During variable setting
> * During some bridging operations
> 
> And more.
> 
> This patch does the following:
>  - It removes a number of fields from channel snapshots. These fields were 
> rarely used, were expensive to have on the snapshot, and hurt performance. 
> This included formats, translation paths, Log Call ID, callgroup, pickup 
> group, and all channel variables. As a result, AMI Status, "core show 
> channel", "core show channelvar", and "pjsip show channel" were modified to 
> either hit the live channel or not show certain pieces of data. While this is 
> unfortunate, the performance gain from this patch is worth the loss in 
> behaviour.
>  - It adds a mechanism to publish a cached snapshot + blob. A large number of 
> publications were changed to use this, including:
>    - During Dial begin
>    - During Variable assignment (if no AMI variables are emitted - if AMI 
> variables are set, we have to make snapshots when a variable is changed)
>    - During channel pickup
>    - When a channel is put on hold/unhold
>    - When a DTMF digit is begun/ended
>    - When creating a bridge snapshot
>    - When an AOC event is raised
>    - During Local channel optimization/Local bridging
>    - When endpoint snapshots are generated
>    - All AGI events
>    - All ARI responses that return a channel
>    - Events in the AgentPool, MeetMe, and some in Queue
>  - Additionally, some extraneous channel snapshots were being made that were 
> unnecessary. These were removed.
>  - The result of ast_hashtab_hash_string is now cached in stasis_cache. This 
> reduces a large number of calls to ast_hashtab_hash_string, which reduced the 
> amount of time spent in this function in gprof by around 50%.
> 
> 
> Diffs
> -----
> 
>   /branches/12/res/res_pjsip/pjsip_configuration.c 414971 
>   /branches/12/res/res_agi.c 414971 
>   /branches/12/res/ari/resource_channels.c 414971 
>   /branches/12/main/stasis_channels.c 414971 
>   /branches/12/main/stasis_cache.c 414971 
>   /branches/12/main/stasis_bridges.c 414971 
>   /branches/12/main/pickup.c 414971 
>   /branches/12/main/pbx.c 414971 
>   /branches/12/main/manager.c 414971 
>   /branches/12/main/endpoints.c 414971 
>   /branches/12/main/dial.c 414971 
>   /branches/12/main/core_local.c 414971 
>   /branches/12/main/cli.c 414971 
>   /branches/12/main/channel_internal_api.c 414971 
>   /branches/12/main/channel.c 414971 
>   /branches/12/main/cel.c 414971 
>   /branches/12/main/bridge_channel.c 414971 
>   /branches/12/main/aoc.c 414971 
>   /branches/12/include/asterisk/stasis_channels.h 414971 
>   /branches/12/include/asterisk/channel.h 414971 
>   /branches/12/apps/app_queue.c 414971 
>   /branches/12/apps/app_meetme.c 414971 
>   /branches/12/apps/app_agent_pool.c 414971 
> 
> Diff: https://reviewboard.asterisk.org/r/3568/diff/
> 
> 
> Testing
> -------
> 
> All sorts of different testing was done. For automated tests that cover these 
> changes, see https://bamboo.asterisk.org/bamboo/browse/ATB-TEAMSP-6. Note 
> that the test failures on that branch are currently also failing on the 12 
> branch/trunk. A few new tests were written and two Parking test's CDR results 
> were tweaked (but not in a fashion that violates the spec or the intent of 
> the records) - for those changes, see: 
> https://reviewboard.asterisk.org/r/3578/
> 
> The first round of testing was profiling Asterisk with gprof to see if I 
> could reduce the amount of time spent creating channel snapshots. This was 
> tested using an ARI script that created a Local channel, subscribed to it, 
> then released that channel to the dialplan. The released Local channel would 
> Dial 10 or so Local channels, which went back into the Stasis application. 
> This increased the number of Local channels rapidly on the system - letting 
> this run for awhile (until hilarity ensued) was a good mechanism to test the 
> affects of removing channel snapshots.
> 
> Once snapshot creation was no longer easily moved down the gprof list, I went 
> ahead and did a bit of investigation on the effects of this patch on CPU 
> using just the dialplan. The fact that snapshots were getting made a lot 
> during AGI/ARI was clearly causing problems, but since dialplan execution 
> still has to create a snapshot, I was curious what effect this patch would 
> have on more "traditional" scenarios. For each of these rounds of tests, I 
> let Asterisk run with SIPp for about 3000 channels. Watching top, I recorded 
> the min/max that showed up during that period of time. Not exactly scientific 
> - but it gives a general impression of the effect of the changes.
> 
> ** Note that wherever I reference CPU, it should be noted that this was run 
> on my laptop (which is rather old) while running Firefox, Pidgin, and a whole 
> host of other stuff. I wasn't attempting to run a true performance analysis; 
> instead, I was just trying to get a relative measurement of things with and 
> without these changes. **
> 
> ** ROUND 1 **
> 
> The first used an inbound PJSIP channel that had a variety of variables set 
> on it, and then got hung up:
> 
> Dialplan:
> 
> exten => test_sip,1,NoOp()
>  same => n,Set(var1=foo)
>  same => n,Set(var2=bar)
>  same => n,Set(var3=yackity)
>  same => n,Set(var4=schmackity)
>  same => n,Answer()
>  same => n,Echo()
> 
> SIPp:
> 
> sipp 127.0.0.1:5060 -sn uac -i 127.0.0.1 -p 5061 -s test_sip -d 5000
> 
> With this patch, CPU was around 9-10%; without this patch, around 10-11%. So 
> not much difference. This isn't too surprising: there's no dialling, there's 
> no interfaces being used, and very few changes being made to the channel. 
> Dialplan execution is the same with/without this patch - the largest 
> 'savings' here is the usage of cached snapshots during variable assignment.
> 
> ** ROUND 2 **
> 
> This round threw in a Local channel and Dial operation. There's a slightly 
> larger improvement here with/without the patch:
> 
> Dialplan 2:
> 
> exten => test_dial_sip,1,NoOp()
>  same => n,Set(var1=foo)
>  same => n,Set(var2=bar)
>  same => n,Set(var3=yackity)
>  same => n,Set(var4=schmackity)
>  same => n,Dial(Local/term@default)
>  same => n,Echo()
> 
> exten => term,1,NoOp()
>  same => n,Set(var1=foo)
>  same => n,Set(var2=bar)
>  same => n,Set(var3=yackity)
>  same => n,Set(var4=schmackity)
>  same => n,Answer()
>  same => n,Echo()
> 
> With this patch, CPU was around 14-18%; without this patch, around 16-21%. 
> Saving a few snapshots during Local channel "bridging", dialling, and setting 
> variables on them shaves 2-3% off the CPU.
> 
> ** ROUND 3 **
> 
> Same as round 2, only with a parallel Dial of multiple Local channels. 
> Improvement was actually about the same as Round 2; with the patch, around 
> 19-23% CPU, without the patch, around 22-26% CPU.
> 
> Dialplan 3:
> 
> exten => test_dial_sip,1,NoOp()
>  same => n,Set(var1=foo)
>  same => n,Set(var2=bar)
>  same => n,Set(var3=yackity)
>  same => n,Set(var4=schmackity)
>  same => 
> n,Dial(Local/term@default&Local/term@default&Local/term@default&Local/term@default&Local/term@default)
>  same => n,Echo()
> 
> exten => term,1,NoOp()
>  same => n,Set(var1=foo)
>  same => n,Set(var2=bar)
>  same => n,Set(var3=yackity)
>  same => n,Set(var4=schmackity)
>  same => n,Answer()
>  same => n,Echo()
> 
> 
> There's a few things to draw from all of this:
> (1) Snapshots aren't free. If you can use a cached version, use it. If you 
> can't, stage the snapshot until all attributes have been changed (which we do 
> quite frequently; this patch did not need to modify any of that behaviour).
> (2) Snapshots should be lean. Don't put something on there that doesn't need 
> to be there. While there is benefit in not hitting live data to get 
> information (reduces the interference of inspection commands in APIs on real 
> time operations), it isn't worth hurting the performance of the overall 
> system.
> (3) We should pay attention to the cost of things in astobj2. Shifting the 
> burden from creating snapshots to cached ones means the Stasis cache - which 
> uses ao2 heavily - has to have good performance.
> (4) It would be worthwhile to see if we can't optimize creation of snapshots 
> further. Often, when we create a snapshot, we only modify one or two 
> properties on it. Usage of memory pools, fixed sized fields with memcpy, and 
> other techniques _may_ improve performance further.
> 
> 
> Thanks,
> 
> Matt Jordan
> 
>

-- 
_____________________________________________________________________
-- Bandwidth and Colocation Provided by http://www.api-digital.com --

asterisk-dev mailing list
To UNSUBSCRIBE or update options visit:
   http://lists.digium.com/mailman/listinfo/asterisk-dev

Reply via email to