Re: ModemManager: MMCallbackInfo scheduled twice when device removed
> All the stuff in the protect-callbacks branch looks good, please push. > I couldn't trigger crashes on a number of devices just pulling them out > in the middle of a tight loop hammering the modem for info. Good work. > This has been pushed to both git master and MM_05. Cheers, -- Aleksander ___ networkmanager-list mailing list networkmanager-list@gnome.org http://mail.gnome.org/mailman/listinfo/networkmanager-list
Re: ModemManager: MMCallbackInfo scheduled twice when device removed
On Tue, 2011-05-17 at 09:57 +0200, Aleksander Morgado wrote: > Hi Dan, > > > > > > > > > Each MMCallbackInfo holds a weak reference to the MMModem to which the > > > > AT command was sent. When the MMModem is destroyed, > > > > mm-callback-info.c::modem_destroyed_cb() gets called and the modem > > > > pointer in the callback info is reset to NULL, to avoid having a pointer > > > > to an already disposed MMModem. See [1]. > > > > > > > > But, in addition to setting the modem pointer to NULL, an explicit error > > > > is just set directly in the callback info, and it gets scheduled. The > > > > problem here is that after destroying the modem, the ports it held also > > > > get closed as part of the destruction, and during port closing there's > > > > the task of finalizing all pending commands (see > > > > mm-serial-port.c::mm_serial_port_close()), which is done by passing an > > > > error to the specific response handler, and which will then (usually) > > > > propagate that error to the MMCallbackInfo and schedule it right away. > > > > > > > > That ends up in a memory leak (most response handlers will just set > > > > info->error without assuming there may already be a GError set); plus a > > > > critical in the second call to mm_callback_info_schedule() because a > > > > callback info can't be scheduled twice. > > > > > > > Got the new approach developed in the 'protect-callbacks' branch in the > following Gitorious repo, ready for review: > git://gitorious.org/lanedo/modemmanager.git > > It includes the following fixes: > > * A new mm_callback_info_check_modem_removed() to be used instead of > mm_modem_check_removed(). All response callbacks should use this new > method to check whether info->modem is NULL and just return if so, > without scheduling the info. > https://gitorious.org/lanedo/modemmanager/commit/a3d9298391b39445755b230144b1fc5cda5c60ed > > * A fix in modem_destroyed_cb() to ensure that ERROR_REMOVED is always > set. > https://gitorious.org/lanedo/modemmanager/commit/3acce64ad591a1a436f2ea29c85f706a2da24e4c > > * A fix in several plugins to avoid using a custom DisableInfo > structure, and use a MMCallbackInfo instead. I ended up doing this using > a custom invoke method, which calls parent disable passing the callback, > instead of the set_data()/get_data() approach. Not sure if you'll like > this :-). > https://gitorious.org/lanedo/modemmanager/commit/19395ffa8f5a5927bca2e441cd135b72248c9a15 All the stuff in the protect-callbacks branch looks good, please push. I couldn't trigger crashes on a number of devices just pulling them out in the middle of a tight loop hammering the modem for info. Good work. Dan ___ networkmanager-list mailing list networkmanager-list@gnome.org http://mail.gnome.org/mailman/listinfo/networkmanager-list
Re: ModemManager: MMCallbackInfo scheduled twice when device removed
Hi Dan, > > > > > > Each MMCallbackInfo holds a weak reference to the MMModem to which the > > > AT command was sent. When the MMModem is destroyed, > > > mm-callback-info.c::modem_destroyed_cb() gets called and the modem > > > pointer in the callback info is reset to NULL, to avoid having a pointer > > > to an already disposed MMModem. See [1]. > > > > > > But, in addition to setting the modem pointer to NULL, an explicit error > > > is just set directly in the callback info, and it gets scheduled. The > > > problem here is that after destroying the modem, the ports it held also > > > get closed as part of the destruction, and during port closing there's > > > the task of finalizing all pending commands (see > > > mm-serial-port.c::mm_serial_port_close()), which is done by passing an > > > error to the specific response handler, and which will then (usually) > > > propagate that error to the MMCallbackInfo and schedule it right away. > > > > > > That ends up in a memory leak (most response handlers will just set > > > info->error without assuming there may already be a GError set); plus a > > > critical in the second call to mm_callback_info_schedule() because a > > > callback info can't be scheduled twice. > > > Got the new approach developed in the 'protect-callbacks' branch in the following Gitorious repo, ready for review: git://gitorious.org/lanedo/modemmanager.git It includes the following fixes: * A new mm_callback_info_check_modem_removed() to be used instead of mm_modem_check_removed(). All response callbacks should use this new method to check whether info->modem is NULL and just return if so, without scheduling the info. https://gitorious.org/lanedo/modemmanager/commit/a3d9298391b39445755b230144b1fc5cda5c60ed * A fix in modem_destroyed_cb() to ensure that ERROR_REMOVED is always set. https://gitorious.org/lanedo/modemmanager/commit/3acce64ad591a1a436f2ea29c85f706a2da24e4c * A fix in several plugins to avoid using a custom DisableInfo structure, and use a MMCallbackInfo instead. I ended up doing this using a custom invoke method, which calls parent disable passing the callback, instead of the set_data()/get_data() approach. Not sure if you'll like this :-). https://gitorious.org/lanedo/modemmanager/commit/19395ffa8f5a5927bca2e441cd135b72248c9a15 Cheers! -- Aleksander ___ networkmanager-list mailing list networkmanager-list@gnome.org http://mail.gnome.org/mailman/listinfo/networkmanager-list
Re: ModemManager: MMCallbackInfo scheduled twice when device removed
> > > > Each MMCallbackInfo holds a weak reference to the MMModem to which the > > AT command was sent. When the MMModem is destroyed, > > mm-callback-info.c::modem_destroyed_cb() gets called and the modem > > pointer in the callback info is reset to NULL, to avoid having a pointer > > to an already disposed MMModem. See [1]. > > > > But, in addition to setting the modem pointer to NULL, an explicit error > > is just set directly in the callback info, and it gets scheduled. The > > problem here is that after destroying the modem, the ports it held also > > get closed as part of the destruction, and during port closing there's > > the task of finalizing all pending commands (see > > mm-serial-port.c::mm_serial_port_close()), which is done by passing an > > error to the specific response handler, and which will then (usually) > > propagate that error to the MMCallbackInfo and schedule it right away. > > > > That ends up in a memory leak (most response handlers will just set > > info->error without assuming there may already be a GError set); plus a > > critical in the second call to mm_callback_info_schedule() because a > > callback info can't be scheduled twice. > > > > I've been looking at this problem some time, and the best way to fix it > > seems to be to just avoid setting the first error and scheduling in > > modem_destroyed_cb(): > > > > diff --git a/src/mm-callback-info.c b/src/mm-callback-info.c > > index 1986bb5..8f8ecca 100644 > > --- a/src/mm-callback-info.c > > +++ b/src/mm-callback-info.c > > @@ -53,13 +53,9 @@ modem_destroyed_cb (gpointer data, GObject > > *destroyed) > > { > > MMCallbackInfo *info = data; > > > > +/* Just set the modem pointer to NULL, do not further process the > > + * callback info */ > > info->modem = NULL; > > -if (!info->pending_id) { > > -info->error = g_error_new_literal (MM_MODEM_ERROR, > > - MM_MODEM_ERROR_REMOVED, > > - "The modem was removed."); > > -mm_callback_info_schedule (info); > > -} > > } > > > > Comments? > > How does that play into the mm_modem_check_removed() checks which I > think are one of the more common places this error is looked for? I > think the motivation here was to be *sure* that any callback handler > waiting for the result got an error. The thing we want to ensure here > is that if a D-Bus client of MM made a request when the modem was pulled > out or crashed (which can happen pretty often) that the client gets a > reply instead of a timed-out D-Bus method return error. I'm sure > there's room for cleanup here, but that was the original motivation. Do > we just need to audit the callback handlers to make sure they do the > right thing with a ERROR_REMOVED? > You're right, removing the error setting in modem_destroyed_cb() is not the right thing to do, if we want to keep the ERROR_REMOVED logic around. I will try to explain the issue with an example flow of execution, and suggest another approach below: (1) MM sends an AT command to the modem, like the periodic ones to get signal strength. (2) Modem is disconnected, reply to the sent AT command was not received yet, so it gets disconnected in the middle of a send-receive operation. (3) MMModem object destruction starts. Due to the weak reference to the modem in the MMCallbackInfo, modem_destroyed_cb() gets called. info->modem is set to NULL, info->error is set to ERROR_REMOVED and the callback info is scheduled. (4) During modem object destruction (in the same loop iteration as step 3!), ports in the modem also get destroyed. During port destruction, all pending commands which didn't get reply are processed, and a SEND_FAILED error is passed to the response-received callback. This callback then uses mm_modem_check_removed() (well, actually not all use it), and as info->modem is NULL, it will overwrite info->error with a newly created ERROR_REMOVED error. Once the new error is set, the callback info is scheduled, again. So always ends up with a memleak and a re-schedule of the callback info. >From what I see, there are some things to fix here then: (a) Not all response-received callbacks use mm_modem_check_removed(), they should all use it. (b) The response-received callback should not overwrite info->error if already set (avoiding memleak). (c) If ERROR_REMOVED is already set in the MMCallbackInfo, the response-received callback should *not* re-schedule the callback info, it should assume it was already scheduled. Another option would be to modify mm_callback_info_schedule() to just allow being called twice, silently returning without doing anything the second time it gets called. What do you think of this? Cheers! -- Aleksander ___ networkmanager-list mailing list networkmanager-list@gnome.org http://mail.gnome.org/mailman/listinfo/networkmanager-list
Re: ModemManager: MMCallbackInfo scheduled twice when device removed
On Wed, 2011-05-11 at 16:20 +0200, Aleksander Morgado wrote: > Hi all, > > Each MMCallbackInfo holds a weak reference to the MMModem to which the > AT command was sent. When the MMModem is destroyed, > mm-callback-info.c::modem_destroyed_cb() gets called and the modem > pointer in the callback info is reset to NULL, to avoid having a pointer > to an already disposed MMModem. See [1]. > > But, in addition to setting the modem pointer to NULL, an explicit error > is just set directly in the callback info, and it gets scheduled. The > problem here is that after destroying the modem, the ports it held also > get closed as part of the destruction, and during port closing there's > the task of finalizing all pending commands (see > mm-serial-port.c::mm_serial_port_close()), which is done by passing an > error to the specific response handler, and which will then (usually) > propagate that error to the MMCallbackInfo and schedule it right away. > > That ends up in a memory leak (most response handlers will just set > info->error without assuming there may already be a GError set); plus a > critical in the second call to mm_callback_info_schedule() because a > callback info can't be scheduled twice. > > I've been looking at this problem some time, and the best way to fix it > seems to be to just avoid setting the first error and scheduling in > modem_destroyed_cb(): > > diff --git a/src/mm-callback-info.c b/src/mm-callback-info.c > index 1986bb5..8f8ecca 100644 > --- a/src/mm-callback-info.c > +++ b/src/mm-callback-info.c > @@ -53,13 +53,9 @@ modem_destroyed_cb (gpointer data, GObject > *destroyed) > { > MMCallbackInfo *info = data; > > +/* Just set the modem pointer to NULL, do not further process the > + * callback info */ > info->modem = NULL; > -if (!info->pending_id) { > -info->error = g_error_new_literal (MM_MODEM_ERROR, > - MM_MODEM_ERROR_REMOVED, > - "The modem was removed."); > -mm_callback_info_schedule (info); > -} > } > > Comments? How does that play into the mm_modem_check_removed() checks which I think are one of the more common places this error is looked for? I think the motivation here was to be *sure* that any callback handler waiting for the result got an error. The thing we want to ensure here is that if a D-Bus client of MM made a request when the modem was pulled out or crashed (which can happen pretty often) that the client gets a reply instead of a timed-out D-Bus method return error. I'm sure there's room for cleanup here, but that was the original motivation. Do we just need to audit the callback handlers to make sure they do the right thing with a ERROR_REMOVED? Dan ___ networkmanager-list mailing list networkmanager-list@gnome.org http://mail.gnome.org/mailman/listinfo/networkmanager-list
ModemManager: MMCallbackInfo scheduled twice when device removed
Hi all, Each MMCallbackInfo holds a weak reference to the MMModem to which the AT command was sent. When the MMModem is destroyed, mm-callback-info.c::modem_destroyed_cb() gets called and the modem pointer in the callback info is reset to NULL, to avoid having a pointer to an already disposed MMModem. See [1]. But, in addition to setting the modem pointer to NULL, an explicit error is just set directly in the callback info, and it gets scheduled. The problem here is that after destroying the modem, the ports it held also get closed as part of the destruction, and during port closing there's the task of finalizing all pending commands (see mm-serial-port.c::mm_serial_port_close()), which is done by passing an error to the specific response handler, and which will then (usually) propagate that error to the MMCallbackInfo and schedule it right away. That ends up in a memory leak (most response handlers will just set info->error without assuming there may already be a GError set); plus a critical in the second call to mm_callback_info_schedule() because a callback info can't be scheduled twice. I've been looking at this problem some time, and the best way to fix it seems to be to just avoid setting the first error and scheduling in modem_destroyed_cb(): diff --git a/src/mm-callback-info.c b/src/mm-callback-info.c index 1986bb5..8f8ecca 100644 --- a/src/mm-callback-info.c +++ b/src/mm-callback-info.c @@ -53,13 +53,9 @@ modem_destroyed_cb (gpointer data, GObject *destroyed) { MMCallbackInfo *info = data; +/* Just set the modem pointer to NULL, do not further process the + * callback info */ info->modem = NULL; -if (!info->pending_id) { -info->error = g_error_new_literal (MM_MODEM_ERROR, - MM_MODEM_ERROR_REMOVED, - "The modem was removed."); -mm_callback_info_schedule (info); -} } Comments? [1] c0253c7c293148a0bdb6c20d4b38a08401d8e34d -- Aleksander ___ networkmanager-list mailing list networkmanager-list@gnome.org http://mail.gnome.org/mailman/listinfo/networkmanager-list