Re: Regression in backport MEMREAD ioctl ? [Was: Re: mt7622: belkin-rt3200: r22602-42eeb22450: Kernel panic: kernel stack overflow]

2023-04-29 Thread Michał Kępień via openwrt-devel
The sender domain has a DMARC Reject/Quarantine policy which disallows
sending mailing list messages using the original "From" header.

To mitigate this problem, the original message has been wrapped
automatically by the mailing list software.
--- Begin Message ---
> > https://github.com/openwrt/openwrt/pull/12472
> 
> thanks a lot for your attempt, but unfortunately it didn't fix the issue.
> 
> I've tried to revert commit fa4dc86 ("kernel: backport MEMREAD ioctl") and
> that fixed the issue as Felix already hinted.

Just to close the loop here: a different fix has been prepared that
replaces recursion with iteration in the mtk_bmt driver:

https://github.com/openwrt/openwrt/pull/12494

This revised fix appears to be working:

https://github.com/openwrt/openwrt/pull/12472#issuecomment-1525636591
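
For readers unfamiliar with the technique, here is a minimal sketch of what
"replacing recursion with iteration" means in general. It is not the mtk_bmt
code from the pull request above; the remap table and every demo_* name are
hypothetical and exist only for illustration. The recursive lookup uses one
stack frame per hop (unless the compiler happens to optimize the tail call),
while the iterative one uses a constant amount of stack, which matters when
the whole task stack is only a few pages.

```c
#define DEMO_NBLOCKS 8

/*
 * Hypothetical remap table: demo_remap[i] is the block that replaces
 * block i, or i itself when block i needs no replacement.
 */
static const int demo_remap[DEMO_NBLOCKS] = { 0, 3, 2, 5, 4, 5, 6, 7 };

/* Recursive lookup: stack usage grows with the length of the chain. */
int demo_resolve_recursive(int block)
{
	if (demo_remap[block] == block)
		return block;
	return demo_resolve_recursive(demo_remap[block]);
}

/*
 * Iterative lookup: constant stack usage. The hop limit also guards
 * against a cycle in the table, which the recursive version would
 * follow until the stack is exhausted.
 */
int demo_resolve_iterative(int block)
{
	int hops;

	for (hops = 0; hops < DEMO_NBLOCKS; hops++) {
		if (demo_remap[block] == block)
			return block;
		block = demo_remap[block];
	}
	return -1;	/* no terminal block found within the hop limit */
}
```

Both functions return the same block for any chain without cycles; only
their stack behaviour differs.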

-- 
Best regards,
Michał Kępień


--- End Message ---
___
openwrt-devel mailing list
openwrt-devel@lists.openwrt.org
https://lists.openwrt.org/mailman/listinfo/openwrt-devel


Re: Regression in backport MEMREAD ioctl ? [Was: Re: mt7622: belkin-rt3200: r22602-42eeb22450: Kernel panic: kernel stack overflow]

2023-04-24 Thread Petr Štetiar
Michał Kępień  [2023-04-24 10:27:48]:

Hi Michał,

> > > Since the panic message includes mentions of a stack overflow, another
> > > idea would be to backport this upstream patch as well:
> > > 
> > >  https://lore.kernel.org/linux-mtd/20230417205654.1982368-1-a...@kernel.org/
> > > 
> > > This patch has been reviewed, but it has not yet been merged anywhere.
> > 
> > Please send a patch to the openwrt mailing list or create a pull request on
> > github.
> 
> https://github.com/openwrt/openwrt/pull/12472

thanks a lot for your attempt, but unfortunately it didn't fix the issue.

I've tried to revert commit fa4dc86 ("kernel: backport MEMREAD ioctl") and
that fixed the issue as Felix already hinted.

Cheers,

Petr



Re: Regression in backport MEMREAD ioctl ? [Was: Re: mt7622: belkin-rt3200: r22602-42eeb22450: Kernel panic: kernel stack overflow]

2023-04-24 Thread Michał Kępień via openwrt-devel
--- Begin Message ---
> > Since the panic message includes mentions of a stack overflow, another
> > idea would be to backport this upstream patch as well:
> > 
> >  https://lore.kernel.org/linux-mtd/20230417205654.1982368-1-a...@kernel.org/
> > 
> > This patch has been reviewed, but it has not yet been merged anywhere.
> 
> Please send a patch to the openwrt mailing list or create a pull request on
> github.

https://github.com/openwrt/openwrt/pull/12472

-- 
Best regards,
Michał Kępień


--- End Message ---


Re: Regression in backport MEMREAD ioctl ? [Was: Re: mt7622: belkin-rt3200: r22602-42eeb22450: Kernel panic: kernel stack overflow]

2023-04-21 Thread Hauke Mehrtens

On 4/21/23 15:17, Michał Kępień wrote:

> Hi Petr,
>
> > > Since the crash happens right after snand driver initialization, I think the
> > > most likely candidate is this one:
> > > fa4dc86e9808 kernel: backport MEMREAD ioctl
> > >
> > > Maybe there are still some stack declarations of struct mtd_oob_ops left
> > > that aren't fully initialized.
> >
> > Thanks for looking into that, Felix. Michał, any idea what might be wrong here?
>
> I remember looking for uninitialized fields in all existing instances of
> struct mtd_oob_ops in version 5.15.98 of the Linux kernel source tree
> while preparing the MEMREAD backports.  However, it did not occur to me
> to check OpenWrt-specific patches in the same way (sorry!) - and a naïve
> search uncovers these three locations:
>
>     $ git grep -E 'struct mtd_oob_ops [^=*{}]+;' -- ':!target/linux/generic/backport-5.15/'
>     package/boot/uboot-mediatek/patches/100-07-mtd-nmbm-add-support-for-mtd.patch:+ struct mtd_oob_ops ops;
>     package/boot/uboot-mediatek/patches/100-07-mtd-nmbm-add-support-for-mtd.patch:+ struct mtd_oob_ops ops;
>     package/boot/uboot-mediatek/patches/100-11-env-add-support-for-NMBM-upper-MTD-layer.patch:+ struct mtd_oob_ops ops;

These patches are applied to U-Boot and not the kernel. The
"fa4dc86e9808 kernel: backport MEMREAD ioctl" change only changes the
kernel.

> Since the panic message includes mentions of a stack overflow, another
> idea would be to backport this upstream patch as well:
>
>  https://lore.kernel.org/linux-mtd/20230417205654.1982368-1-a...@kernel.org/
>
> This patch has been reviewed, but it has not yet been merged anywhere.


Please send a patch to the openwrt mailing list or create a pull request 
on github.


hauke



Re: Regression in backport MEMREAD ioctl ? [Was: Re: mt7622: belkin-rt3200: r22602-42eeb22450: Kernel panic: kernel stack overflow]

2023-04-21 Thread Michał Kępień via openwrt-devel
--- Begin Message ---
Hi Petr,

> > Since the crash happens right after snand driver initialization, I think the
> > most likely candidate is this one:
> > fa4dc86e9808 kernel: backport MEMREAD ioctl
> > 
> > Maybe there are still some stack declarations of struct mtd_oob_ops left
> > that aren't fully initialized.
> 
> Thanks for looking into that, Felix. Michał, any idea what might be wrong here?

I remember looking for uninitialized fields in all existing instances of
struct mtd_oob_ops in version 5.15.98 of the Linux kernel source tree
while preparing the MEMREAD backports.  However, it did not occur to me
to check OpenWrt-specific patches in the same way (sorry!) - and a naïve
search uncovers these three locations:

    $ git grep -E 'struct mtd_oob_ops [^=*{}]+;' -- ':!target/linux/generic/backport-5.15/'
    package/boot/uboot-mediatek/patches/100-07-mtd-nmbm-add-support-for-mtd.patch:+ struct mtd_oob_ops ops;
    package/boot/uboot-mediatek/patches/100-07-mtd-nmbm-add-support-for-mtd.patch:+ struct mtd_oob_ops ops;
    package/boot/uboot-mediatek/patches/100-11-env-add-support-for-NMBM-upper-MTD-layer.patch:+ struct mtd_oob_ops ops;

Both structures in the first patch are zeroed out using memset() after
they are declared, so that's fine, but the one in the second patch
isn't.

Given that MediaTek hardware is involved here, this sounds like a solid
lead.  Updating 100-11-env-add-support-for-NMBM-upper-MTD-layer.patch so
that the line quoted above says this instead:

struct mtd_oob_ops ops = {};

would be my first suggestion.
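
To illustrate why the initializer matters, here is a minimal userspace sketch.
It assumes a patch adds a new pointer member to a structure that older callers
fill in field by field. The structures and all demo_* names are hypothetical
stand-ins, not the kernel or U-Boot definitions, and the 0xAA fill only
simulates stale stack contents, since reading a truly uninitialized variable
is undefined behaviour. The portable "{0}" used below has the same effect as
the "= {}" form suggested above, which is a GNU C / C23 spelling.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a per-request statistics structure. */
struct demo_req_stats {
	unsigned int max_bitflips;
};

/* Hypothetical stand-in for struct mtd_oob_ops; not the real definition. */
struct demo_oob_ops {
	unsigned int mode;
	size_t len;
	unsigned char *datbuf;
	struct demo_req_stats *stats;	/* pretend a later patch added this */
};

/* A caller written before "stats" existed: it sets only the fields it knows. */
static void demo_fill_legacy(struct demo_oob_ops *ops, unsigned char *buf,
			     size_t len)
{
	assert(ops != NULL);
	ops->mode = 0;
	ops->len = len;
	ops->datbuf = buf;
}

/* Returns 1 if a consumer would see "stats" as absent, 0 otherwise. */
static int demo_stats_is_unset(const struct demo_oob_ops *ops)
{
	return ops->stats == NULL;
}

/* Case 1: the structure starts out holding leftover bytes. */
int demo_case_dirty(void)
{
	unsigned char buf[16];
	struct demo_oob_ops ops;

	memset(&ops, 0xAA, sizeof(ops));	/* simulate stale stack contents */
	demo_fill_legacy(&ops, buf, sizeof(buf));
	return demo_stats_is_unset(&ops);
}

/* Case 2: the structure is zero-initialized at its declaration. */
int demo_case_zeroed(void)
{
	unsigned char buf[16];
	struct demo_oob_ops ops = {0};

	demo_fill_legacy(&ops, buf, sizeof(buf));
	return demo_stats_is_unset(&ops);
}
```

In the first case the new member keeps whatever bytes were already there, so
code that checks it against NULL before dereferencing would follow a bogus
pointer; in the second case it is reliably NULL.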

Since the panic message includes mentions of a stack overflow, another
idea would be to backport this upstream patch as well:

https://lore.kernel.org/linux-mtd/20230417205654.1982368-1-a...@kernel.org/

This patch has been reviewed, but it has not yet been merged anywhere.

Hope this helps,

-- 
Best regards,
Michał Kępień


--- End Message ---


Regression in backport MEMREAD ioctl ? [Was: Re: mt7622: belkin-rt3200: r22602-42eeb22450: Kernel panic: kernel stack overflow]

2023-04-21 Thread Petr Štetiar
Felix Fietkau  [2023-04-21 12:03:23]:

[adding Michał and Christian to the mail loop]

> On 21.04.23 09:11, Petr Štetiar wrote:
> > Hi,
> > 
> > I've just noticed that the daily CI runtime testing job on belkin-rt3200
> > failed[1] due to the following:
> > 
> >   Insufficient stack space to handle exception!
> >   ESR: 0x9647 -- DABT (current EL)
> >   FAR: 0xffc008c47fe0
> >   Task stack: [0xffc008c48000..0xffc008c4c000]
> >   IRQ stack:  [0xffc008008000..0xffc00800c000]
> >   Overflow stack: [0xff801feb00a0..0xff801feb10a0]
> >   CPU: 1 PID: 1 Comm: swapper/0 Tainted: G S 5.15.107 #0
> >   Hardware name: Linksys E8450 (DT)
> >   pstate: 80c5 (Nzcv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> >   pc : dequeue_entity+0x0/0x250
> >   lr : dequeue_task_fair+0x98/0x290
> >   sp : ffc008c48030
> >   x29: ffc008c48030 x28: 0001 x27: ff801feb6380
> >   x26: 0001 x25: ff801feb6300 x24: ff868000
> >   x23: 0001 x22: 0009 x21: 
> >   x20: ff801feb6380 x19: ff868080 x18: 17a740a6
> >   x17: ffc008bae748 x16: ffc008bae6d8 x15: 
> >   x14:  x13:  x12: 000f0101
> >   x11: 0449 x10: 0127 x9 : 
> >   x8 : 0125 x7 : 00116da1 x6 : 00116da1
> >   x5 : 001165a1 x4 : ff801feb6e00 x3 : 
> >   x2 : 0009 x1 : ff868080 x0 : ff801feb6380
> >   Kernel panic - not syncing: kernel stack overflow
> >   SMP: stopping secondary CPUs
> >   SMP: failed to stop secondary CPUs 0-1
> >   Kernel Offset: disabled
> >   CPU features: 0x3000,0802
> >   Memory Limit: none
> > 
> > Last working version was r22580-e11d00d44c[2], and first failing version was
> > yesterday 1416b9bbe9, so possibly the regression was introduced in one of the
> > following commits:
> > 
> >   1416b9bbe9d3 tools/dwarves: update to 1.25
> >   9931188edcbc kernel: fix up qrtr packaging after 5.15.107 bump
> >   f4989239cc91 kernel: bump 5.15 to 5.15.107
> >   89f6ac5fd1ad tools/cmake: update to 3.26.3
> >   ab3f151aa874 mwlwifi: update to version 10.3.9.0-20230311
> >   5ec781c4448b bmips: pci-bcm6348: load IO resource from DT ranges
> >   16b0cbbde057 bmips: drop unneeded ath9k fixup
> >   db4f158c0330 bmips: hg556a: switch to kmod-owl-loader
> >   36150ff6ffb2 tools/bzip2: add `bzip2` binaries
> >   b691362d1dbe Revert "tools/bzip2: add `bzip2` binaries"
> >   f7f47b136991 mac80211: ath11k: replace 160MHz fix with upstream pending one
> >   4ab4b9ea818d build: fix incorrect initramfs gzip compression
> >   69bc620180d2 build: fix incorrect initramfs bzip2 compression
> >   394d7134ec42 tools/bzip2: add `bzip2` binaries
> >   5264296ce480 ath79: mikrotik: update kernel on NAND using Yafut
> >   27acf2413e91 yafut: add a kernel update tool for MikroTik NAND
> >   fa4dc86e9808 kernel: backport MEMREAD ioctl
> >   e722b667c5a5 mac80211: update to v6.1.24
> 
> Since the crash happens right after snand driver initialization, I think the
> most likely candidate is this one:
> fa4dc86e9808 kernel: backport MEMREAD ioctl
> 
> Maybe there are still some stack declarations of struct mtd_oob_ops left
> that aren't fully initialized.

Thanks for looking into that, Felix. Michał, any idea what might be wrong here?

Cheers,

Petr
