Re: How to recover from serial device being force closed.

2024-04-09 Thread Aleksander Morgado
Hey!

>
> I completely understand why there may be concerns at editing such critical 
> code for a use case that is not very common.
>
> Unfortunately my current situation requires renaming the interfaces. Multiple 
> modems may take turns being 'selected' (having the interface renamed) to pass 
> traffic. I hope to remove that need one day, but until then I will need this 
> patch.
>

So you have a modem that is in connected state but isn't selected to
pass traffic, and when it gets selected, the net interface is renamed
to some other name? I think it is the first time I've seen something
like this :) Is it because some other app uses a hardcoded net
interface name somewhere or something like that?

> Even if this is an uncommon problem, I still think this change (or at least 
> my approach) could be useful to pull into ModemManager mainline. Yes, people 
> don't rename interfaces often, but it is something that Linux/udev support, 
> these events effectively breaks the Modem object lifecycle in ModemManager 
> (which makes ModemManager look buggy), the log messages that are printed out 
> do not make it clear that renaming the interface caused it (it took my a long 
> time to figure it out), and I have not seen any piece of documentation 
> warning that renaming any interface managed by ModemManager is unsupported.
>

Technically it is completely possible, but my worry is that it may
introduce an amount of complexity that is not worth the effort.
Updating the documentation to say that MM currently does not support
interface renaming once the device has been initially probed would be
quicker... and maybe we should do that right away even before thinking
about supporting your usecase.

> You do bring up an interesting point I had missed: what if someone renamed a 
> control interface? My code currently would not catch that, but maybe it 
> should!
>

Renaming the control interface would be overkill. Renaming the net
port may be somewhat easier.

> ModemManager handles tons of events and states for modems and all the weird 
> things they can do. If a user could do something with the ModemManager API 
> that caused strange errors like I described, the issue would be fixed in the 
> driver/etc.
>
> Linux Interfaces can be renamed using the Linux API, and while they are 
> renamed, ModemManager is in a precarious state until the interface is changed 
> back to its original name. Removing or resetting the modem in that precarious 
> state will cause ModemManager to act strangely for that modem until it is 
> removed again with the original interface name (which could be a problem if 
> another modem was added while the interface was renamed). And linux+udev 
> gives us all the tools to handle these known and supported events. I 
> personally think these are enough reasons to address this behavior.
>
> I see this as an issue of stability, reliability, and resiliency. Even if 
> this behavior is rarely encountered, I believe there are substantial reasons 
> to fix it. That does not mean that my code is the best way to do it. If you 
> end up agreeing that this behavior should be fixed, I will make revisions to 
> address your concerns.
>


MM also has a lot of limitations, and one of the limitations that I
always warn about is that the state of the modem should not be touched
out of MM's own APIs. E.g. if you run QMI commands that modify the
state of the modem using qmicli, MM may not pick them up. If the state
needs to be updated, it should be done using MM's own APIs. I think we
can say the same about this problem, you're trying to modify the state
of how the modem is exposed in the kernel out of MM's eyes, and expect
MM to pick that up. Technically both things could be possible, but it
may really be too much complexity introduced for little gain,
especially when the issues can be avoided in other ways :/

> To the complexity of my code, I tried to made it pretty straight forward and 
> self contained. When a rename event happens, find the Modem object associated 
> with the old device name, remove the old name from the modem, and add the new 
> name. I could make the replacement logic into a function to keep the code 
> more readable if you would prefer.
>

Aren't we fully re-probing the modem from scratch upon the net
interface rename? We could try to do that as an initial step, in the
same way as we do the forced reprobing when e.g. the proxy dies or
when the SIM card is switched. The reprobing logic will make MM
re-detect the modem object with the new state.

> You mention opening the door to more complex interactions, and that is 
> interesting. If I don't correctly handle the udev events, then I am not 
> really fixing anything!
>
>  I just looked up all of the udev event types: add, remove, change, move, 
> online, offline, bind, and unbind.
>
> ModemManager already has support for add, remove, change (kind of, more on 
> that later), and with my patch move (I would argue that the upstream version 
> of 

Re: How to recover from serial device being force closed.

2024-03-27 Thread Jessy Exum
Sorry for the late reply,

I completely understand why there may be concerns at editing such critical
code for a use case that is not very common.

Unfortunately my current situation requires renaming the interfaces.
Multiple modems may take turns being 'selected' (having the interface
renamed) to pass traffic. I hope to remove that need one day, but until
then I will need this patch.

Even if this is an uncommon problem, I still think this change (or at least
my approach) could be useful to pull into ModemManager mainline. Yes,
people don't rename interfaces often, but it is something that Linux/udev
support, these events effectively breaks the Modem object lifecycle in
ModemManager (which makes ModemManager look buggy), the log messages that
are printed out do not make it clear that renaming the interface caused it
(it took my a long time to figure it out), and I have not seen any piece of
documentation warning that renaming any interface managed by ModemManager
is unsupported.

You do bring up an interesting point I had missed: what if someone renamed
a control interface? My code currently would not catch that, but maybe it
should!

ModemManager handles tons of events and states for modems and all the weird
things they can do. If a user could do something with the ModemManager API
that caused strange errors like I described, the issue would be fixed in
the driver/etc.

Linux Interfaces can be renamed using the Linux API, and while they are
renamed, ModemManager is in a precarious state until the interface is
changed back to its original name. Removing or resetting the modem in that
precarious state will cause ModemManager to act strangely for that modem
until it is removed again with the original interface name (which could be
a problem if another modem was added while the interface was renamed). And
linux+udev gives us all the tools to handle these known and supported
events. I personally think these are enough reasons to address this
behavior.

I see this as an issue of stability, reliability, and resiliency. Even if
this behavior is rarely encountered, I believe there are substantial
reasons to fix it. That does not mean that my code is the best way to do
it. If you end up agreeing that this behavior should be fixed, I will make
revisions to address your concerns.

To the complexity of my code, I tried to made it pretty straight forward
and self contained. When a rename event happens, find the Modem object
associated with the old device name, remove the old name from the modem,
and add the new name. I could make the replacement logic into a function to
keep the code more readable if you would prefer.

You mention opening the door to more complex interactions, and that is
interesting. If I don't correctly handle the udev events, then I am not
really fixing anything!

 I just looked up all of the udev event types: *add*, *remove*, *change*,
*move*, *online*, *offline*, *bind*, and *unbind*.

ModemManager already has support for *add*, *remove*, *change* (kind of,
more on that later), and with my patch *move* (I would argue that the
upstream version of the move handler was incorrect).

The *online* and *offline* events are supposedly for things like a network
interface coming up or going down, so I do not think these events will be
relevant to ModemManager at all. The *bind* and *unbind* events are for
when drivers are attached or detached from the device, which I also believe
is useless to ModemManager and could not affect anything with the Modem
lifecycle.

But the *change* event is interesting. I am getting contradicting
descriptions from different sources about *change* vs *move*. Perhaps you
are right that this could cause similar issues. I will probably need to
rework this to properly handle both in case *change* can be triggered in a
rename, but either way, more research is required.

I may also be double handling *move*, which could be handled more
gracefully.

I will try to fix these up in the coming weeks or months as time allows. I
welcome feedback and ideas even before I finish the patch.

Thanks as usual to the ModemManager team for building such a useful tool,
Jessy Diamond Exum

On Tue, Jan 30, 2024 at 2:44 AM Aleksander Morgado <
aleksande...@chromium.org> wrote:

> Hey,
>
> > In my application, I use ModemManager to connect the modem, and then I
> rename the network interface to a special name that is required by another
> part of my system (and change the name back if the modem disconnects).
>
> Is it not possible to rename the interface before even MM tries to grab
> the port?
>
> We could support a patch like yours, but it seems overly complicated and
> may open the door to even more complex interactions, e.g. what if someone
> comes with the idea of renaming the control port instead of the net port.
> I'm sure it's technically possible to do all that, but not sure how much
> benefit it will bring.
>
> --
> Aleksander
>


Re: How to recover from serial device being force closed.

2024-01-30 Thread Aleksander Morgado
Hey,

> In my application, I use ModemManager to connect the modem, and then I
rename the network interface to a special name that is required by another
part of my system (and change the name back if the modem disconnects).

Is it not possible to rename the interface before even MM tries to grab the
port?

We could support a patch like yours, but it seems overly complicated and
may open the door to even more complex interactions, e.g. what if someone
comes with the idea of renaming the control port instead of the net port.
I'm sure it's technically possible to do all that, but not sure how much
benefit it will bring.

-- 
Aleksander


Re: How to recover from serial device being force closed.

2023-11-15 Thread Jessy Exum
I think I figured out why the `forced close` issue is happening to me!
Short answer: network interface renaming.

In my application, I use ModemManager to connect the modem, and then I
rename the network interface to a special name that is required by another
part of my system (and change the name back if the modem disconnects).

I added some additional logging to ModemManager's hardware detection code
and was watching the udev events as the modem was rebooted (comparing when
the modem is properly torn down and when it is not). I noticed that when a
network device is removed (like the CDC interfaces on the modem) the
interface's current name is listed in the remove event (which makes sense),
but ModemManager does not update the name of the network interfaces
internally, so when a remove uevent comes through with the renamed
interface name, ModemManager does not realize it is one of the port that is
associated with the modem. And because in this situation there is always
one 'port' on the modem (even though it has actually been removed), the
modem is not disposed of.

 modem reset caused zombie modem ===

[8627]:  [1700034180.178810] [ttyACM1/at] unexpected port hangup!
[8627]:  [1700034180.178885] [ttyACM1/at] forced to close port
[8627]:  [1700034180.178923] [ttyACM1/at] device open count is 0
(close)
[8627]:  [1700034180.178969] [ttyACM1/at] closing serial port...
[8627]:  [1700034180.183019] [ttyACM1/at] serial port closed
[8627]:   [1700034180.183358] [base-manager] EXTRALOG
MMBaseManager::handle_uevent(remove) "tty", "ttyACM0"
[8627]:   [1700034180.183411] [base-manager] EXTRALOG
MMBaseManager::device_removed("tty", "ttyACM0"
[8627]:   [1700034180.183468] [base-manager] port ttyACM0 released by
device '/sys/devices/platform/soc/310.usb3/xhci-hcd.2.auto/usb5/5-1'
[8627]:   [1700034180.185135] [base-manager] EXTRALOG
MMBaseManager::handle_uevent(remove) "tty", "ttyACM1"
[8627]:   [1700034180.185214] [base-manager] EXTRALOG
MMBaseManager::device_removed("tty", "ttyACM1"
[8627]:   [1700034180.185271] [base-manager] port ttyACM1 released by
device '/sys/devices/platform/soc/310.usb3/xhci-hcd.2.auto/usb5/5-1'
[8627]:   [1700034180.190975] [base-manager] EXTRALOG
MMBaseManager::handle_uevent(remove) "tty", "ttyACM2"
[8627]:   [1700034180.191056] [base-manager] EXTRALOG
MMBaseManager::device_removed("tty", "ttyACM2"
[8627]:   [1700034180.19] [base-manager] port ttyACM2 released by
device '/sys/devices/platform/soc/310.usb3/xhci-hcd.2.auto/usb5/5-1'
[8627]:   [1700034180.193025] [base-manager] EXTRALOG
MMBaseManager::handle_uevent(remove) "tty", "ttyACM4"
[8627]:   [1700034180.193099] [base-manager] EXTRALOG
MMBaseManager::device_removed("tty", "ttyACM4"
[8627]:   [1700034180.193150] [base-manager] port ttyACM4 released by
device '/sys/devices/platform/soc/310.usb3/xhci-hcd.2.auto/usb5/5-1'
[8627]:   [1700034180.197629] [base-manager] EXTRALOG
MMBaseManager::handle_uevent(remove) "tty", "ttyACM3"
[8627]:   [1700034180.197713] [base-manager] EXTRALOG
MMBaseManager::device_removed("tty", "ttyACM3"
[8627]:   [1700034180.197768] [base-manager] port ttyACM3 released by
device '/sys/devices/platform/soc/310.usb3/xhci-hcd.2.auto/usb5/5-1'
[8627]:   [1700034180.224472] [base-manager] EXTRALOG
MMBaseManager::handle_uevent(remove) "net", "cell_cdc"
[8627]:   [1700034180.224667] [base-manager] EXTRALOG
MMBaseManager::device_removed("net", "cell_cdc"
[8627]:   [1700034180.253334] [base-manager] EXTRALOG
MMBaseManager::handle_uevent(remove) "net", "usb1"
[8627]:   [1700034180.253410] [base-manager] EXTRALOG
MMBaseManager::device_removed("net", "usb1"
[8627]:   [1700034180.253462] [base-manager] port usb1 released by
device '/sys/devices/platform/soc/310.usb3/xhci-hcd.2.auto/usb5/5-1'

=== modem boot ==

[8627]:   [1700034187.060079] [base-manager] EXTRALOG
MMBaseManager::handle_uevent(add) "tty", "ttyACM1"
[8627]:  [1700034187.060497] [ttyACM1] port contents loaded (EXTRALOG
DEVICE UDEV):
[8627]:  [1700034187.060540] [ttyACM1]   bus: usb
[8627]:  [1700034187.060594] [ttyACM1]   interface:
/sys/devices/platform/soc/310.usb3/xhci-hcd.2.auto/usb5/5-1/5-1:1.2
[8627]:  [1700034187.060631] [ttyACM1]   device:
/sys/devices/platform/soc/310.usb3/xhci-hcd.2.auto/usb5/5-1
[8627]:  [1700034187.060667] [ttyACM1]   driver: cdc_acm
[8627]:  [1700034187.060700] [ttyACM1]   vendor: 1e2d
[8627]:  [1700034187.060733] [ttyACM1]   product: 0061
[8627]:  [1700034187.060767] [ttyACM1]   revision: 0232
[8627]:   [1700034187.060802] [base-manager] EXTRALOG
MMBaseManager::device_added("ttyACM1", 1, 0)
[8627]:  [1700034187.060841] [base-manager] adding port ttyACM1 at
sysfs path:
/sys/devices/platform/soc/310.usb3/xhci-hcd.2.auto/usb5/5-1/5-1:1.2/tty/ttyACM1
[8627]:  [1700034187.060978] [filter] (tty/ttyACM1) port allowed:
device is allowlisted by plugin (vid)
[8627]:  [1700034187.061032] [base-manager] additional port ttyACM1
in device /sys/devices/platform/soc/310.usb3/xhci-hcd.2.auto/usb5/5-1

Re: How to recover from serial device being force closed.

2023-05-03 Thread Aleksander Morgado
Hey,

>
> it would be great if you could verify the behavior on the main branch
> (at least on the last stable 1.20.6), and give us some logs, so we can
> have a better idea of the issue and how proceed, whether in a general
> manner, or with some specific fixes.
>

Yes, especially since several bugfixes were done that could help solve
the problem (although not sure if they were done in 1.20 or earlier
really)

> >
> > I have encountered an issue while running ModemManager (version 1.18.4) on 
> > a Cinterion PLAS8 modem. Sometimes, the serial device used for AT 
> > communication with the modem is force closed (due to G_IO_HUP) without any 
> > obvious reason.
> >

Also, are you using NetworkManager along with ModemManager? Or are you
running pppd yourself? You must ensure that MM only touches the port
once pppd has exited, otherwise they'll use mismatched CLOCAL flags
and MM may get a HUP when the modem got disconnected from the network
(as opposed to getting a HUP only when the port goes away from the
system)

-- 
Aleksander


Re: How to recover from serial device being force closed.

2023-05-02 Thread Giacinto Cifelli
Hello Jessy,

it would be great if you could verify the behavior on the main branch
(at least on the last stable 1.20.6), and give us some logs, so we can
have a better idea of the issue and how proceed, whether in a general
manner, or with some specific fixes.

Giacinto




On Mon, May 1, 2023 at 8:47 AM Jessy Exum  wrote:
>
> Hello ModemManager devs,
>
> I have encountered an issue while running ModemManager (version 1.18.4) on a 
> Cinterion PLAS8 modem. Sometimes, the serial device used for AT communication 
> with the modem is force closed (due to G_IO_HUP) without any obvious reason.
>
> Despite the modem being unreachable over the dead connection, the modem 
> remains listed in mmcli (without being marked as having an error), and any 
> ModemManager command that would send an AT command to the modem fails with an 
> error similar to:
>
> GDBus.Error:org.freedesktop.ModemManager1.Error.Core.Connected: Cannot run 
> sequence: 'Could not open serial device ttyACM1: it has been forced close'
>
>
> When I first saw this issue, I thought it was the modem acting up (it would 
> not be the first time), but I found that restarting ModemManager recovers the 
> modem. This tells me that the modem was still functioning, and ModemManager 
> was unable (or never tried) to re-open the channel. I have not had this error 
> occur on a device with ModemManager debugging enabled, so I can not be sure 
> if reconnection is attempted (I will try to find a way to manually trigger 
> this issue so I can get better debugging).
>
> If I could reset the modem, that should cause it to be re-initiated in 
> ModemManger and likely fix the issue. But as far as I know, there is no way 
> to have ModemManager re-enumerate a modem without removing the device 
> (physically, through a modem reset, etc). Unfortunately, I am also unable to 
> recover the modem through ModemManager because a modem reset requires the 
> ability to send AT commands to the modem. Currently, the only solutions I 
> have found to recover the modem is to physically power cycle the modem (which 
> is not always possible), or to stop and restart ModemManager (inconvenient if 
> there are other modems). I may also be able to use an external script to 
> detect ModemManager getting into this state, and directly send a reset 
> command to the modem over the AT channel (without ModemManager), but I would 
> prefer if ModemManager was able to recover this on its own.
>
> The error occurs rarely/randomly enough that I am not interested in chasing 
> down why this happens (it could be the modem being flakey, electrical 
> interference, whatever). I am more interested in being able to gracefully 
> recover from this situation, so I want to know what ModemManager should be 
> doing so I can try to help fix it.
>
> Any suggestions or information would be greatly appreciated.
>
> Thanks for all the hard work in this useful tool,
> Jessy Diamond Exum
>


How to recover from serial device being force closed.

2023-05-01 Thread Jessy Exum
Hello ModemManager devs,

I have encountered an issue while running ModemManager (version 1.18.4) on
a Cinterion PLAS8 modem. *Sometimes, the serial device used for AT
communication with the modem is force closed (due to G_IO_HUP) without any
obvious reason.*

Despite the modem being unreachable over the dead connection, the modem
remains listed in mmcli (without being marked as having an error), and any
ModemManager command that would send an AT command to the modem fails with
an error similar to:

GDBus.Error:org.freedesktop.ModemManager1.Error.Core.Connected: Cannot
run sequence: 'Could not open serial device ttyACM1: it has been
forced close'


When I first saw this issue, I thought it was the modem acting up (it would
not be the first time), but I found that restarting ModemManager recovers
the modem. This tells me that the modem was still functioning, and
ModemManager was unable (or never tried) to re-open the channel. I have not
had this error occur on a device with ModemManager debugging enabled, so I
can not be sure if reconnection is attempted (I will try to find a way to
manually trigger this issue so I can get better debugging).

If I could reset the modem, that should cause it to be re-initiated in
ModemManger and likely fix the issue. But as far as I know, there is no way
to have ModemManager re-enumerate a modem without removing the device
(physically, through a modem reset, etc). Unfortunately, I am also unable
to recover the modem through ModemManager because a modem reset requires
the ability to send AT commands to the modem. Currently, the only solutions
I have found to recover the modem is to physically power cycle the modem
(which is not always possible), or to stop and restart ModemManager
(inconvenient if there are other modems). I may also be able to use an
external script to detect ModemManager getting into this state, and
directly send a reset command to the modem over the AT channel (without
ModemManager), but I would prefer if ModemManager was able to recover this
on its own.

The error occurs rarely/randomly enough that I am not interested in chasing
down why this happens (it could be the modem being flakey, electrical
interference, whatever). I am more interested in being able to gracefully
recover from this situation, so *I want to know what ModemManager should be
doing so I can try to help fix it.*

Any suggestions or information would be greatly appreciated.

Thanks for all the hard work in this useful tool,
Jessy Diamond Exum