Hi again,

I think, I have found a fix for this. With the following two patches applied to 
NetworkManager 1.12.2, the issue seems to be gone (final confirmation pending):

https://gitlab.freedesktop.org/NetworkManager/NetworkManager/commit/1913a4d2590d42413b04c0cf86321286bb2e2dbd
 
<https://gitlab.freedesktop.org/NetworkManager/NetworkManager/commit/1913a4d2590d42413b04c0cf86321286bb2e2dbd>
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/commit/3392c439582b4ba5da19423fbaeed5843aeae9a9
 
<https://gitlab.freedesktop.org/NetworkManager/NetworkManager/commit/3392c439582b4ba5da19423fbaeed5843aeae9a9>

I suspect, the latter made the difference because pppd basically hangs when the 
bus method call doesn’t return.

Sven

> On 23 Oct 2018, at 11:37, Sven Schwermer <s...@svenschwermer.de> wrote:
> 
> Hi Aleksander, Matthew,
> 
> I looked into this and the issue that Aleksander referred to in the other 
> thread (https://bugzilla.gnome.org/show_bug.cgi?id=734347 
> <https://bugzilla.gnome.org/show_bug.cgi?id=734347>). In the issue, 
> Aleksander mentions that MM needs to be informed about the broken connection 
> after pppd has terminated so that the CLOCAL setting is restored when MM 
> interacts with the TTY again. However, if I interpret my logs correctly, MM 
> detects itself that the connection is broken based on the modem’s response to 
> AT+CGACT? (+CGACT: 1,0). This then triggers the notification signal 
> MM_PORT_CONNECTED (with connected=false) which makes port_connected try to 
> acquire the serial port lock which then fails.
> 
> Sven
> 
>> On 22 Oct 2018, at 16:52, Matthew Starr <mst...@hedonline.com 
>> <mailto:mst...@hedonline.com>> wrote:
>> 
>> Aleksander,
>> 
>> This is the exact same issue I reported a couple months ago:
>> https://lists.freedesktop.org/archives/modemmanager-devel/2018-August/006593.html
>>  
>> <https://lists.freedesktop.org/archives/modemmanager-devel/2018-August/006593.html>
>> 
>> I am currently working around it by adding a NetworkManager dispatcher 
>> script that on ppp0 going down, I restart ModemManager.
>> 
>> I never did find a solution to the issue even after trying some 
>> modifications to the serial force close code (src/mm-port-serial.c) in 
>> ModemManager.  It looks like the force_closed boolean gets set and is never 
>> reset, which causes the serial port to never retry communications over the 
>> serial port again.
>> 
>> Regards,
>> Matthew Starr
>> 
>>> Date: Mon, 22 Oct 2018 11:39:37 +0200
>>> From: Sven Schwermer <s...@svenschwermer.de <mailto:s...@svenschwermer.de>>
>>> To: Aleksander Morgado <aleksan...@aleksander.es 
>>> <mailto:aleksan...@aleksander.es>>
>>> Cc: modemmanager-devel@lists.freedesktop.org 
>>> <mailto:modemmanager-devel@lists.freedesktop.org>
>>> Subject: Re: Infinite connection loop
>>> Message-ID: <41689DBE-1755-4F85-BFA7-
>>> 97cf6e588...@svenschwermer.de <mailto:97cf6e588...@svenschwermer.de>>
>>> Content-Type: text/plain; charset="utf-8"
>>> 
>>> I looked a little closer into this now.
>>> 
>>> Here’s what I think is happening:
>>> 1. My cell operator terminates the LCP after some time of inactivity. This 
>>> is
>>> confirmed by my operator.
>>> 2. pppd receives the termination request, acknowledges it and the modem
>>> hangs up the connection.
>>> 3. ModemManager fails to re-acquire serial port lock because of the hangup.
>>> 4. ModemManager sets the connection state to ‘registered’.
>>> 5. ModemManager force-closes the serial port.
>>> 6. NetworkManager registers the lost connection and tries to re-establish 
>>> it.
>>> 7. The ModemManager fails to re-open serial port because it had been force-
>>> closed.
>>> 
>>> pppd doesn’t appear to be re-launched. How is this all supposed to work?
>>> I’m sure the software stack should be able to handle a terminated
>>> connection and re-dial it?
>>> 
>>> Please find attached the debug-level log. The LCP termination happens at Oct
>>> 21 14:53:51.
>>> 
>>> Best regards,
>>> Sven
>>> 
>>> -------------- next part --------------
>>> A non-text attachment was scrubbed...
>>> Name: excerpt.long.gz
>>> Type: application/x-gzip
>>> Size: 53662 bytes
>>> Desc: not available
>>> URL: <https://lists.freedesktop.org/archives/modemmanager- 
>>> <https://lists.freedesktop.org/archives/modemmanager->
>>> devel/attachments/20181022/1e9e348d/attachment.bin>
>>> -------------- next part --------------
>>> 
>>> 
>>>> On 20 Oct 2018, at 16:26, Sven Schwermer <s...@svenschwermer.de 
>>>> <mailto:s...@svenschwermer.de>>
>>> wrote:
>>>> 
>>>> Hi again,
>>>> 
>>>> I now managed to capture the event in the journal. I limited the auto
>>> connect retried to 4 in order to keep the log size manageable. I didn’t have
>>> debug-level logging active for NetworkManager at the time. I’ll enable that
>>> now and try to reproduce. Please find the log attached. The error seems to
>>> be happening at Oct 20 07:44:26.
>>>> 
>>>> Best regards,
>>>> Sven
>>>> 
>>>> <ed6e56cf2ac5b6bdb2e3ca97276ebd6d.journal.log>
>>>> 
>>>>> On 20 Oct 2018, at 13:47, Aleksander Morgado
>>> <aleksan...@aleksander.es <mailto:aleksan...@aleksander.es>> wrote:
>>>>> 
>>>>> Hey,
>>>>> 
>>>>>> I am running an embedded system with a u-blox SARA-U270 modem and
>>> NetworkManager 1.12.2 / ModemManager 1.8.0. For some reason, the
>>> modem connection seems to break over time. I am not quite sure why that’s
>>> the case, but when I look into the journal, I see NetworkManager and
>>> ModemManager spewing out log messages at a very high rate. This actually
>>> causes significant system load (mostly by dbus-daemon). Here’s an excerpt
>>> of the messages:
>>>>>> 
>>>>>> Oct 17 12:54:29 cf352c1 ModemManager[1385]: Simple connect started...
>>>>>> Oct 17 12:54:29 cf352c1 ModemManager[1385]: Simple connect state
>>> (4/8): Wait to get fully enabled
>>>>>> Oct 17 12:54:29 cf352c1 ModemManager[1385]: Simple connect state
>>> (5/8): Register
>>>>>> Oct 17 12:54:29 cf352c1 ModemManager[1385]: Simple connect state
>>> (6/8): Bearer
>>>>>> Oct 17 12:54:29 cf352c1 ModemManager[1385]: Simple connect started...
>>>>>> Oct 17 12:54:29 cf352c1 ModemManager[1385]: Simple connect state
>>> (4/8): Wait to get fully enabled
>>>>>> Oct 17 12:54:29 cf352c1 ModemManager[1385]: Simple connect state
>>> (5/8): Register
>>>>>> Oct 17 12:54:29 cf352c1 ModemManager[1385]: Simple connect state
>>> (6/8): Bearer
>>>>>> Oct 17 12:54:29 cf352c1 NetworkManager[677]: <warn>
>>> [1539780869.7776] modem-broadband[ttyACM0]: failed to connect modem:
>>> Could not open serial device ttyACM0: it has been forced close
>>>>>> Oct 17 12:54:29 cf352c1 NetworkManager[677]: <info>
>>> [1539780869.7778] device (ttyACM0): state change: prepare -> failed (reason
>>> 'unknown', sys-iface-state: 'managed')
>>>>>> Oct 17 12:54:29 cf352c1 NetworkManager[677]: <info>
>>> [1539780869.7807] manager: NetworkManager state is now
>>> CONNECTED_LOCAL
>>>>>> Oct 17 12:54:29 cf352c1 NetworkManager[677]: <warn>
>>> [1539780869.7840] device (ttyACM0): Activation: failed for connection
>>> 'gprsconnection'
>>>>>> Oct 17 12:54:29 cf352c1 NetworkManager[677]: <info>
>>> [1539780869.7860] device (ttyACM0): state change: failed -> disconnected
>>> (reason 'none', sys-iface-state: 'managed')
>>>>>> Oct 17 12:54:29 cf352c1 NetworkManager[677]: <info>
>>> [1539780869.7923] policy: auto-activating connection 'gprsconnection'
>>> (14e02637-0ffb-3a28-806f-ad60904afe7f)
>>>>>> Oct 17 12:54:29 cf352c1 NetworkManager[677]: <info>
>>> [1539780869.8005] device (ttyACM0): Activation: starting connection
>>> 'gprsconnection' (14e02637-0ffb-3a28-806f-ad60904afe7f)
>>>>>> Oct 17 12:54:29 cf352c1 NetworkManager[677]: <info>
>>> [1539780869.8015] device (ttyACM0): state change: disconnected -> prepare
>>> (reason 'none', sys-iface-state: 'managed')
>>>>>> Oct 17 12:54:29 cf352c1 NetworkManager[677]: <info>
>>> [1539780869.8043] manager: NetworkManager state is now CONNECTING
>>>>>> Oct 17 12:54:29 cf352c1 ModemManager[1385]: Simple connect started...
>>>>>> Oct 17 12:54:29 cf352c1 ModemManager[1385]: Simple connect state
>>> (4/8): Wait to get fully enabled
>>>>>> Oct 17 12:54:29 cf352c1 ModemManager[1385]: Simple connect state
>>> (5/8): Register
>>>>>> Oct 17 12:54:29 cf352c1 ModemManager[1385]: Simple connect state
>>> (6/8): Bearer
>>>>>> Oct 17 12:54:29 cf352c1 ModemManager[1385]: Simple connect started...
>>>>>> Oct 17 12:54:29 cf352c1 ModemManager[1385]: Simple connect state
>>> (4/8): Wait to get fully enabled
>>>>>> Oct 17 12:54:29 cf352c1 ModemManager[1385]: Simple connect state
>>> (5/8): Register
>>>>>> Oct 17 12:54:29 cf352c1 ModemManager[1385]: Simple connect state
>>> (6/8): Bearer
>>>>>> Oct 17 12:54:29 cf352c1 NetworkManager[677]: <warn>
>>> [1539780869.9321] modem-broadband[ttyACM0]: failed to connect modem:
>>> Could not open serial device ttyACM0: it has been forced close
>>>>>> Oct 17 12:54:29 cf352c1 NetworkManager[677]: <info>
>>> [1539780869.9323] device (ttyACM0): state change: prepare -> failed (reason
>>> 'unknown', sys-iface-state: 'managed')
>>>>>> Oct 17 12:54:29 cf352c1 NetworkManager[677]: <info>
>>> [1539780869.9350] manager: NetworkManager state is now
>>> CONNECTED_LOCAL
>>>>>> Oct 17 12:54:29 cf352c1 NetworkManager[677]: <warn>
>>> [1539780869.9399] device (ttyACM0): Activation: failed for connection
>>> 'gprsconnection'
>>>>>> Oct 17 12:54:29 cf352c1 NetworkManager[677]: <info>
>>> [1539780869.9430] device (ttyACM0): state change: failed -> disconnected
>>> (reason 'none', sys-iface-state: 'managed')
>>>>>> Oct 17 12:54:29 cf352c1 NetworkManager[677]: <info>
>>> [1539780869.9490] policy: auto-activating connection 'gprsconnection'
>>> (14e02637-0ffb-3a28-806f-ad60904afe7f)
>>>>>> Oct 17 12:54:29 cf352c1 NetworkManager[677]: <info>
>>> [1539780869.9569] device (ttyACM0): Activation: starting connection
>>> 'gprsconnection' (14e02637-0ffb-3a28-806f-ad60904afe7f)
>>>>>> Oct 17 12:54:29 cf352c1 NetworkManager[677]: <info>
>>> [1539780869.9585] device (ttyACM0): state change: disconnected -> prepare
>>> (reason 'none', sys-iface-state: 'managed')
>>>>>> Oct 17 12:54:29 cf352c1 NetworkManager[677]: <info>
>>> [1539780869.9653] manager: NetworkManager state is now CONNECTING
>>>>>> 
>>>>>> It continues like that forever. My modem network connection config is
>>> as follows:
>>>>>> 
>>>>>> [connection]
>>>>>> id=gprsconnection
>>>>>> type=gsm
>>>>>> autoconnect=true
>>>>>> autoconnect-retries=0
>>>>>> [gsm]
>>>>>> number=*99#
>>>>>> [ipv4]
>>>>>> dns-search=
>>>>>> method=auto
>>>>>> may-fail=false
>>>>>> 
>>>>>> I have been looking through the code and what I don’t understand is
>>> why the ttyACM0 would ever be forcefully closed. What could be an
>>> explanation for this?
>>>>>> 
>>>>>> Any insight is appreciated.
>>>>>> 
>>>>> 
>>>>> Do you have a full log of when this issue happens? If you can
>>>>> reproduce this more or less frequently, could you setup both
>>>>> ModemManager and NetworkManager to dump debug logs and try to get
>>> it
>>>>> reproduced? This issue could be to a wrong synchronization between MM
>>>>> and NM/pppd, in particular related to the CLOCAL settings in each case
>>>>> (i.e. MM shouldn't get HUPs in the serial port unless for the cases
>>>>> where the port is going away).
>>>>> 
>>>>> --
>>>>> Aleksander
>>>>> https://aleksander.es <https://aleksander.es/>

_______________________________________________
ModemManager-devel mailing list
ModemManager-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/modemmanager-devel

Reply via email to