bullseye-installer fails on Cubox-i due to networking issue

2020-12-24 Thread Rainer Dorsch
Hi,

I tried to run the bullseye installer from

http://ftp2.de.debian.org/debian/dists/bullseye/main/installer-armhf/current/images/netboot/SD-card-images/

on a cubox-i using a serial console today.

It seems the network interface does not come up properly:

~ # dmesg |grep eth
[5.009246] fec 2188000.ethernet: Invalid MAC address: 00:00:00:00:00:00
[5.015982] fec 2188000.ethernet: Using random MAC address: 4a:0d:a7:66:c1:e6
[5.028381] mdio_bus 2188000.ethernet-1: MDIO device at address 0 is missing.
[  138.479638] fec 2188000.ethernet eth0: Unable to connect to phy
[  139.674014] fec 2188000.ethernet eth0: Unable to connect to phy
[  141.218830] fec 2188000.ethernet eth0: Unable to connect to phy
[  147.400881] fec 2188000.ethernet eth0: Unable to connect to phy
[  199.375688] fec 2188000.ethernet eth0: Unable to connect to phy
[  736.031852] fec 2188000.ethernet eth0: Unable to connect to phy
[  906.069383] fec 2188000.ethernet eth0: Unable to connect to phy
[ 1156.891662] fec 2188000.ethernet eth0: Unable to connect to phy
[ 1266.982998] fec 2188000.ethernet eth0: Unable to connect to phy
~ # 
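
A log like the one above can be summarized with a few lines of Python. This is only a minimal sketch: the function name phy_failures and the shortened sample text are illustrative assumptions, and the pattern assumes each message sits on a single line with a leading "[seconds]" timestamp.

```python
import re

# Kernel log lines look like "[  138.479638] fec 2188000.ethernet eth0: ..."
LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+(.*)$")


def phy_failures(dmesg_text):
    """Return the timestamps (in seconds) of 'Unable to connect to phy' lines."""
    stamps = []
    for line in dmesg_text.splitlines():
        match = LINE_RE.match(line.strip())
        if match and "Unable to connect to phy" in match.group(2):
            stamps.append(float(match.group(1)))
    return stamps


# Shortened sample taken from the log above (illustrative only).
sample = """\
[5.009246] fec 2188000.ethernet: Invalid MAC address: 00:00:00:00:00:00
[  138.479638] fec 2188000.ethernet eth0: Unable to connect to phy
[  139.674014] fec 2188000.ethernet eth0: Unable to connect to phy
"""

failures = phy_failures(sample)
print(len(failures), "PHY connection failures, first at", failures[0], "s")
```

The timestamps show that the failures keep recurring for many minutes after boot rather than clearing up on their own.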

Not sure if this backtrace is related or even expected:

[5.626874] Freeing unused kernel memory: 2048K
[5.632309] ------------[ cut here ]------------
[5.636982] WARNING: CPU: 0 PID: 1 at arch/arm/mm/dump.c:248 note_page+0x3d0/0x3dc
[5.644576] arm/mm: Found insecure W+X mapping at address 0xf0879000
[5.650947] Modules linked in:
[5.654033] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.9.0-4-armmp #1 Debian 5.9.11-1
[5.661953] Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
[5.668483] Backtrace:
[5.670946] [] (dump_backtrace) from [] (show_stack+0x20/0x24)
[5.678522]  r7:00f8 r6:6013 r5: r4:c136910c
[5.684192] [] (show_stack) from [] (dump_stack+0xc8/0xdc)
[5.691430] [] (dump_stack) from [] (__warn+0xe0/0x148)
[5.698396]  r7:00f8 r6:0009 r5:c031eb6c r4:c0eb24e0
[5.704062] [] (__warn) from [] (warn_slowpath_fmt+0xa4/0xe4)
[5.711548]  r7:c031eb6c r6:00f8 r5:c0eb24e0 r4:c0eb24ac
[5.717214] [] (warn_slowpath_fmt) from [] (note_page+0x3d0/0x3dc)
[5.725135]  r8: r7: r6:0005 r5:c130c440 r4:ee951f28
[5.731841] [] (note_page) from [] (walk_pmd+0xe8/0x1a4)
[5.738895]  r10:ee951f28 r9:c0207c20 r8:ee8c6800 r7: r6:c0eb2528 r5:f087b000
[5.746724]  r4:ee8c61ec
[5.749263] [] (walk_pmd) from [] (ptdump_check_wx+0x88/0x104)
[5.756838]  r10: r9: r8: r7: r6:c0208000 r5:f080
[5.764667]  r4:c0207c28
[5.767215] [] (ptdump_check_wx) from [] (mark_rodata_ro+0x3c/0x40)
[5.775223]  r6: r5:c0c9ab94 r4:
[5.779853] [] (mark_rodata_ro) from [] (kernel_init+0x44/0x130)
[5.787605] [] (kernel_init) from [] (ret_from_fork+0x14/0x2c)
[5.795176] Exception stack(0xee951fb0 to 0xee951ff8)
[5.800232] 1fa0:   
 
[5.808414] 1fc0:       
 
[5.816595] 1fe0:     0013 
[5.823210]  r5:c0c9ab94 r4:
[5.826848] ---[ end trace eee50453771fe1ab ]---
[5.831714] Checked W+X mappings: FAILED, 1 W+X pages found
[5.837378] Run /init as init process


I ran into the same issue a few days ago with a current xbian image on the
same machine. That image was buster-based but had a recent kernel, which
suggests that the issue is kernel-related.

The same network cable works on a buster x86 system, and even on the cubox-i
with an SD card carrying an older xbian version, without any issues.

Any hint or idea as to what could be wrong is appreciated.

If anybody knows an easier way to get debug data (e.g. dmesg output) off the
cubox than copying and pasting from minicom, that would also help.
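
One possible approach, given here as a sketch rather than a recipe tested on this board: minicom can log the session to a file when started with its -C capture option, and that file can then be cleaned up offline. The example below assumes such a capture already exists. The function name clean_capture and the sample bytes are made up for illustration.

```python
import re

# ANSI/VT100 CSI escape sequences such as "\x1b[0m" or "\x1b[2J"
ANSI_RE = re.compile(r"\x1b\[[0-9;?]*[A-Za-z]")


def clean_capture(raw):
    """Turn raw serial-console bytes into plain text.

    Removes terminal escape sequences and the carriage returns that
    serial consoles emit before each newline.
    """
    text = raw.decode("utf-8", errors="replace")
    text = ANSI_RE.sub("", text)
    return text.replace("\r\n", "\n").replace("\r", "")


# Illustrative bytes, shaped like what a serial capture might contain.
raw = (
    b"\x1b[0m~ # dmesg | grep eth\r\n"
    b"[    5.009246] fec 2188000.ethernet: Invalid MAC address\r\n"
)
print(clean_capture(raw), end="")
```

In practice the bytes would come from the capture file, read in binary mode, and the cleaned text could be pasted straight into a mail.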

Many thanks
Rainer


-- 
Rainer Dorsch
http://bokomoko.de/




Re: bullseye-installer fails on Cubox-i due to networking issue

2020-12-24 Thread Arnd Bergmann
On Thu, Dec 24, 2020 at 3:38 PM Rainer Dorsch wrote:
>
> Hi,
>
> I tried to run the bullseye installer from
>
> http://ftp2.de.debian.org/debian/dists/bullseye/main/installer-armhf/current/
> images/netboot/SD-card-images/
>
> on a cubox-i using a serial console today.
>
> It seems the network interface does not come up properly:
>
> ~ # dmesg |grep eth
> [5.009246] fec 2188000.ethernet: Invalid MAC address: 00:00:00:00:00:00
> [5.015982] fec 2188000.ethernet: Using random MAC address: 4a:
> 0d:a7:66:c1:e6
> [5.028381] mdio_bus 2188000.ethernet-1: MDIO device at address 0 is
> missing.
> [  138.479638] fec 2188000.ethernet eth0: Unable to connect to phy
> [  139.674014] fec 2188000.ethernet eth0: Unable to connect to phy
> [  141.218830] fec 2188000.ethernet eth0: Unable to connect to phy
> [  147.400881] fec 2188000.ethernet eth0: Unable to connect to phy
> [  199.375688] fec 2188000.ethernet eth0: Unable to connect to phy
> [  736.031852] fec 2188000.ethernet eth0: Unable to connect to phy
> [  906.069383] fec 2188000.ethernet eth0: Unable to connect to phy
> [ 1156.891662] fec 2188000.ethernet eth0: Unable to connect to phy
> [ 1266.982998] fec 2188000.ethernet eth0: Unable to connect to phy

Which was the last kernel version on which it worked correctly?

There were a couple of regressions caused by incorrect phy-mode
settings after a phy driver changed its behavior in an incompatible
way.

This should be the relevant hunk for your board; it was merged into
linux-5.2:

0672d22a1924 ("ARM: dts: imx: Fix the AR803X phy-mode")

diff --git a/arch/arm/boot/dts/imx6qdl-sr-som.dtsi b/arch/arm/boot/dts/imx6qdl-sr-som.dtsi
index 4ccb7afc4b35..6d7f6b9035bc 100644
--- a/arch/arm/boot/dts/imx6qdl-sr-som.dtsi
+++ b/arch/arm/boot/dts/imx6qdl-sr-som.dtsi
@@ -53,7 +53,7 @@ vcc_3v3: regulator-vcc-3v3 {
 &fec {
    pinctrl-names = "default";
    pinctrl-0 = <&pinctrl_microsom_enet_ar8035>;
-   phy-mode = "rgmii";
+   phy-mode = "rgmii-id";
    phy-reset-duration = <2>;
    phy-reset-gpios = <&gpio4 15 GPIO_ACTIVE_LOW>;
    status = "okay";

If you have a dtb file from before that change and want to run it
on a newer kernel, at least this change is needed.

> ~ #
>
> Not sure if this backtrace is related or even expected:
>
> [5.626874] Freeing unused kernel memory: 2048K
> [5.632309] [ cut here ]
> [5.636982] WARNING: CPU: 0 PID: 1 at arch/arm/mm/dump.c:248
> note_page+0x3d0/0x3dc
> [5.644576] arm/mm: Found insecure W+X mapping at address 0xf0879000
> [5.650947] Modules linked in:
> [5.654033] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.9.0-4-armmp #1
> Debian 5.9.11-1
> [5.661953] Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)

I see the current kernel version here, which is helpful in figuring out the
problem, but I don't think the warning is relevant here.

I do see two code changes that may be relevant:

0da1ccbbefb6 ("net: fec: Fix PHY init after phy_reset_after_clk_enable()")
1e6114f51f9d ("net: fec: fix MDIO probing for some FEC hardware blocks")

Both of them were backported into linux-5.9.y and are part of 5.9.7 or newer,
so you probably have them already. There is a chance, though, that one of
these patches caused a regression, so maybe try v5.9.0 for comparison.
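
Whether a given kernel already carries these backports can be checked with a small version comparison. The following is a minimal sketch under stated assumptions: the helper names are hypothetical, the input is the upstream or Debian package version (for example "5.9.11-1", not the ABI name "5.9.0-4-armmp"), and the check deliberately covers only the stable series of the first fixed release, because other series would need their own cutoff.

```python
import re


def upstream_version(version):
    """Extract (major, minor, patch) from strings like '5.9.11-1' or 'v5.9.0'."""
    match = re.search(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"no kernel version found in {version!r}")
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch or 0)


def has_stable_backport(version, first_fixed):
    """True if `version` is in the same x.y series as `first_fixed` and not older."""
    ver = upstream_version(version)
    fixed = upstream_version(first_fixed)
    return ver[:2] == fixed[:2] and ver >= fixed


# 5.9.11 is the Debian kernel from the log above; 5.9.7 is the release named above.
print(has_stable_backport("5.9.11-1", "5.9.7"))
print(has_stable_backport("v5.9.0", "5.9.7"))
```

With the versions from this thread, the 5.9.11 kernel falls on the side that includes both patches, while a plain v5.9.0 does not, which is what makes v5.9.0 a useful comparison point.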

  Arnd



Re: bullseye-installer fails on Cubox-i due to networking issue

2020-12-25 Thread Rainer Dorsch
Hi Arnd,

many thanks for the quick response.

There is good news: networking works both on LibreELEC (which uses 5.10.1)
and on the daily builds of the Debian bullseye installer (which use 5.9.15).
The problem is in the latest release of the Debian bullseye installer,
Alpha 3, which I think uses 5.9.11.

I extracted imx6q-cubox-i.dtb from the bullseye Alpha 3 installer; it already
contains the modification:

rd@h370:~/tmp.nobackup$ fdtdump imx6q-cubox-i.dtb | grep phy-mode

 fdtdump is a low-level debugging tool, not meant for general use.
 If you want to decompile a dtb, you probably want
 dtc -I dtb -O dts 

phy-mode = "rgmii-id";
rd@h370:~/tmp.nobackup$ 
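
The same check can be scripted against decompiled device tree source, for instance the text produced by the dtc invocation that fdtdump suggests above. This is a minimal sketch: the function name phy_modes and the sample node are illustrative assumptions, not content taken from the actual dtb.

```python
import re

# Matches properties of the form: phy-mode = "rgmii-id";
PHY_MODE_RE = re.compile(r'phy-mode\s*=\s*"([^"]+)"\s*;')


def phy_modes(dts_text):
    """Return every phy-mode value found in decompiled device tree source."""
    return PHY_MODE_RE.findall(dts_text)


# Illustrative fragment only; a real decompiled dtb is much longer.
sample_dts = """
ethernet@2188000 {
    phy-mode = "rgmii-id";
    status = "okay";
};
"""

modes = phy_modes(sample_dts)
print(modes)
if "rgmii" in modes:
    print("plain rgmii found: the dtb predates the phy-mode fix")
```

A dtb that still reports plain "rgmii" for this board would predate the fix quoted earlier in the thread.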

The xbian image I tested uses 5.9.12, which has the networking issue as well.

Given that it works with the latest versions of the 5.9 and 5.10 series and
with the daily build of the bullseye installer, I think the topic can be
considered resolved.

Thanks again
Rainer

-- 
Rainer Dorsch
http://bokomoko.de/