> On Nov. 25, 2013, 6:54 p.m., rmudgett wrote: > > /branches/12/main/core_local.c, lines 682-685 > > <https://reviewboard.asterisk.org/r/2961/diff/4/?file=48509#file48509line682> > > > > Revert all the changes you've made to local_call() and reacquire the > > locks in publish_local_bridge_message() using ast_unreal_lock_all(). As it > > is you have paths that leave channels locked. > > Mark Michelson wrote: > I see one path that leaves the channels locked. Why not just correct that > one path? It seems wasteful to reacquire the locks this way when the parent > function that calls publish_local_bridge_message() has already done the work > to acquire the locks once.
It is simpler to verify that the locks are correct if publish_local_bridge_message() does its own locking. - rmudgett ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviewboard.asterisk.org/r/2961/#review10273 ----------------------------------------------------------- On Nov. 19, 2013, 11:17 p.m., Mark Michelson wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviewboard.asterisk.org/r/2961/ > ----------------------------------------------------------- > > (Updated Nov. 19, 2013, 11:17 p.m.) > > > Review request for Asterisk Developers. > > > Bugs: ASTERISK-22709 > https://issues.asterisk.org/jira/browse/ASTERISK-22709 > > > Repository: Asterisk > > > Description > ------- > > John Bigelow discovered that if the newly-created atxfer_threeway_nominal > test is run in a loop, eventually a test run will cause the UUT instance of > Asterisk to crash. Investigation of the crash showed that the crash was being > caused while trying to get the string representation of a channel's > translation paths while creating a channel snapshot. The attended transfer > manager thread was creating the channel snapshot as part of the attended > transfer stasis message while in another thread (in this case, the bridge > channel thread) the translation paths were being altered to make the channel > compatible with others in the bridge. > > It's clear that the issue is that the channel's contents are not lock > protected while creating a channel snapshot. Luckily, the channel's lock is > held already when updating translation paths. By adding locking, I was able > to run the test over 450 times straight with no negative consequences. > > When adding locking, there were two strategies to follow: > > 1) Take advantage of the fact that channel locks are recursive, and just add > a single SCOPED_CHANNELLOCK() call to ast_channel_snapshot_create() and call > it a day. > 2) Make an audit of channel-related calls that end up creating a channel > snapshot and require locks be held prior to making those calls. > > I like approach 2 more, even though it's harder to pull off. In investigating > what paths result in channel snapshots being created, I found that channel > locks should be required before calling the following functions: > > * ast_channel_snapshot_create() > * ast_channel_blob_create() > * ast_channel_publish_snapshot() > * ast_publish_channel_state() > * ast_channel_publish_blob() > * ast_channel_publish_varset() > * ast_channel_amaflags_set() > * ast_channel_callid_set() > * ast_channel_whentohangup_set() > * ast_channel_callgroup_set() > * ast_channel_pickupgroup_set() > * ast_channel_internal_bridge_set() > * ast_channel_language_set() > * ast_channel_accountcode_set() > * ast_channel_peeraccount_set() > * ast_channel_linkedid_set() > * ast_channel_stage_snapshot() > * ast_setstate() > * ast_aoc_manager_event() > * ast_channel_setwhentohangup_tv() > * ast_channel_setwhentohangup() > * ast_set_variables() > > All of these functions' documentation have been updated to indicate a > precondition of having the channel locked. > > In addition, there are some more complex functions that create channel > snapshots but could not easily be made to have the precondition of having the > channel locked. These are: > > * ast_bridge_blob_create() > * ast_publish_attended_transfer_fail() > * ast_publish_attended_transfer_bridge_merge() > * ast_publish_attended_transfer_threeway() > * ast_publish_attended_transfer_app() > * ast_publish_attended_transfer_link() > * ast_bridge_publish_enter() > * ast_bridge_publish_leave() > * ast_bridge_publish_blind_transfer() > > These functions' documentation now have a precondition of having no channels > or bridges locked. > > This review seeks only to make sure that channels are properly locked prior > to creating channel snapshots. A similar effort is likely needed to ensure > that bridges are locked during bridge snapshot creation. > > > Diffs > ----- > > /branches/12/tests/test_stasis_channels.c 402818 > /branches/12/tests/test_cel.c 402818 > /branches/12/tests/test_cdr.c 402818 > /branches/12/res/res_stasis.c 402818 > /branches/12/res/res_pjsip_refer.c 402818 > /branches/12/res/res_agi.c 402818 > /branches/12/res/parking/parking_manager.c 402818 > /branches/12/res/parking/parking_bridge_features.c 402818 > /branches/12/pbx/pbx_realtime.c 402818 > /branches/12/main/stasis_channels.c 402818 > /branches/12/main/stasis_bridges.c 402818 > /branches/12/main/pickup.c 402818 > /branches/12/main/pbx.c 402818 > /branches/12/main/endpoints.c 402818 > /branches/12/main/dial.c 402818 > /branches/12/main/core_unreal.c 402818 > /branches/12/main/core_local.c 402818 > /branches/12/main/channel.c 402818 > /branches/12/main/cel.c 402818 > /branches/12/main/bridge_channel.c 402818 > /branches/12/main/bridge.c 402818 > /branches/12/include/asterisk/stasis_channels.h 402818 > /branches/12/include/asterisk/stasis_bridges.h 402818 > /branches/12/include/asterisk/channelstate.h 402818 > /branches/12/include/asterisk/channel.h 402818 > /branches/12/include/asterisk/aoc.h 402818 > /branches/12/funcs/func_timeout.c 402818 > /branches/12/channels/sig_pri.c 402818 > /branches/12/channels/sig_analog.c 402818 > /branches/12/channels/chan_vpb.cc 402818 > /branches/12/channels/chan_unistim.c 402818 > /branches/12/channels/chan_skinny.c 402818 > /branches/12/channels/chan_sip.c 402818 > /branches/12/channels/chan_pjsip.c 402818 > /branches/12/channels/chan_phone.c 402818 > /branches/12/channels/chan_oss.c 402818 > /branches/12/channels/chan_nbs.c 402818 > /branches/12/channels/chan_motif.c 402818 > /branches/12/channels/chan_misdn.c 402818 > /branches/12/channels/chan_mgcp.c 402818 > /branches/12/channels/chan_jingle.c 402818 > /branches/12/channels/chan_iax2.c 402818 > /branches/12/channels/chan_h323.c 402818 > /branches/12/channels/chan_gtalk.c 402818 > /branches/12/channels/chan_dahdi.c 402818 > /branches/12/channels/chan_console.c 402818 > /branches/12/channels/chan_alsa.c 402818 > /branches/12/apps/app_voicemail.c 402818 > /branches/12/apps/app_userevent.c 402818 > /branches/12/apps/app_queue.c 402818 > /branches/12/apps/app_meetme.c 402818 > /branches/12/apps/app_disa.c 402818 > /branches/12/apps/app_dial.c 402818 > /branches/12/apps/app_confbridge.c 402818 > /branches/12/apps/app_agent_pool.c 402818 > /branches/12/addons/chan_ooh323.c 402818 > /branches/12/addons/chan_mobile.c 402818 > > Diff: https://reviewboard.asterisk.org/r/2961/diff/ > > > Testing > ------- > > As stated in the description, I ran the atxfer_threeway_nominal test in a > loop. After approximately 450 test runs, there were no negative consequences. > Prior to this changeset, running the test 20-50 times would result in a crash. > > > Thanks, > > Mark Michelson > >
-- _____________________________________________________________________ -- Bandwidth and Colocation Provided by http://www.api-digital.com -- asterisk-dev mailing list To UNSUBSCRIBE or update options visit: http://lists.digium.com/mailman/listinfo/asterisk-dev