Re: [syzbot] kernel panic: corrupted stack end in openat
On Wed, Mar 17, 2021 at 9:32 AM Arnd Bergmann wrote: > > > > > wrote: > > > > > > On Tue, Mar 16, 2021 at 04:44:45PM +0100, Arnd Bergmann wrote: > > > > > > > On Tue, Mar 16, 2021 at 11:17 AM Dmitry Vyukov > > > > > > > wrote: > > > > > > > > The compiler is gcc version 10.2.1 20210110 (Debian 10.2.1-6) > > > > > > > > > > > > > > Ok, building with Ubuntu 10.2.1-1ubuntu1 20201207 locally, that's > > > > > > > the closest I have installed, and I think the Debian and Ubuntu > > > > > > > versions > > > > > > > are generally quite close in case of gcc since they are > > > > > > > maintained by > > > > > > > the same packagers. > > > > > > > > > > > > ... which shouldn't be a problem - that's just over 1/4 of the stack > > > > > > space. Could it be the syzbot's gcc is doing something weird and > > > > > > inflating the stack frames? > > > > > > > > > > It's possible, I think that's really unlikely given that it's just > > > > > Debian's > > > > > gcc, which is as close to mainline as the version I was using. > > > > > > > > > > Uwe's DEBUG_STACKOVERFLOW patch from a while ago might > > > > > help if this was the problem though: > > > > > https://lore.kernel.org/linux-arm-kernel/20200108082913.29710-1-u.kleine-koe...@pengutronix.de/ > > > > > > > > > > My best guess is something going wrong in the interrupt > > > > > that triggered the preempt_schedule() which ended up calling > > > > > task_stack_end_corrupted() in schedule_debug(), as you suggested > > > > > earlier. > > > > > > > > FWIW I see slightly larger frames with the config: > > > > > > > > 073ab64 : > > > > 8073ab64: e1a0c00dmov ip, sp > > > > 8073ab68: e92ddff0push{r4, r5, r6, r7, r8, r9, sl, > > > > fp, ip, lr, pc} > > > > 8073ab6c: e24cb004sub fp, ip, #4 > > > > 8073ab70: e24ddfa7sub sp, sp, #668; 0x29c > > > > > > Yes, this is the one that the compiler complained about when warning > > > for stack over 600 bytes. It's not called in this call chain though. 
> > > >
> > > > page_alloc can also do reclaim, I had the impression that reclaim can
> > > > be quite heavy-weight in all respects.
> > > >
> > > Yes, that is another possibility. What writable file systems or swap
> > > do you normally have mounted that it could be writing to, and on
> > > what storage device?
> >
> > The root fs is ext4 on virtio-blk.
> >
> > There are also several dozens of shrinkers that can be called during
> > reclaim:
> > https://elixir.bootlin.com/linux/latest/C/ident/unregister_shrinker
>
> Right, unfortunately I don't see a smoking gun there either, unless you are
> also using NFS or devicemapper.
>
> Implementing VMAP_STACK as you suggested earlier is probably the
> best way to figure out if there is an actual overrun of the stack.
> Alternatively, adding support for GCC_PLUGIN_STACKLEAK might
> also help find out if we ever get close to the limit. This is probably
> less work, but it might not actually help in this case.

VMAP_STACK is quite intrusive as far as I understand.

For KASAN I considered a simpler option: have a debug config that
allocates an extra page after the stack and mprotect's it. It wastes a
physical page per task (fine for a debug config), but I would assume it
should be radically simpler to implement. In the end somebody
implemented proper VMAP_STACK support for KASAN, but I still think the
extra-page approach may be a reasonable compromise between time
investment and value.
Re: [syzbot] kernel panic: corrupted stack end in openat
On Wed, Mar 17, 2021 at 8:52 AM Dmitry Vyukov wrote: > On Tue, Mar 16, 2021 at 5:28 PM Arnd Bergmann wrote: > > On Tue, Mar 16, 2021 at 5:13 PM Dmitry Vyukov wrote: > > > On Tue, Mar 16, 2021 at 5:03 PM Arnd Bergmann wrote: > > > > On Tue, Mar 16, 2021 at 4:51 PM Russell King - ARM Linux admin > > > > wrote: > > > > > On Tue, Mar 16, 2021 at 04:44:45PM +0100, Arnd Bergmann wrote: > > > > > > On Tue, Mar 16, 2021 at 11:17 AM Dmitry Vyukov > > > > > > wrote: > > > > > > > The compiler is gcc version 10.2.1 20210110 (Debian 10.2.1-6) > > > > > > > > > > > > Ok, building with Ubuntu 10.2.1-1ubuntu1 20201207 locally, that's > > > > > > the closest I have installed, and I think the Debian and Ubuntu > > > > > > versions > > > > > > are generally quite close in case of gcc since they are maintained > > > > > > by > > > > > > the same packagers. > > > > > > > > > > ... which shouldn't be a problem - that's just over 1/4 of the stack > > > > > space. Could it be the syzbot's gcc is doing something weird and > > > > > inflating the stack frames? > > > > > > > > It's possible, I think that's really unlikely given that it's just > > > > Debian's > > > > gcc, which is as close to mainline as the version I was using. > > > > > > > > Uwe's DEBUG_STACKOVERFLOW patch from a while ago might > > > > help if this was the problem though: > > > > https://lore.kernel.org/linux-arm-kernel/20200108082913.29710-1-u.kleine-koe...@pengutronix.de/ > > > > > > > > My best guess is something going wrong in the interrupt > > > > that triggered the preempt_schedule() which ended up calling > > > > task_stack_end_corrupted() in schedule_debug(), as you suggested > > > > earlier. 
> > > > > > FWIW I see slightly larger frames with the config: > > > > > > 073ab64 : > > > 8073ab64: e1a0c00dmov ip, sp > > > 8073ab68: e92ddff0push{r4, r5, r6, r7, r8, r9, sl, > > > fp, ip, lr, pc} > > > 8073ab6c: e24cb004sub fp, ip, #4 > > > 8073ab70: e24ddfa7sub sp, sp, #668; 0x29c > > > > Yes, this is the one that the compiler complained about when warning > > for stack over 600 bytes. It's not called in this call chain though. > > > > > page_alloc can also do reclaim, I had the impression that reclaim can > > > be quite heavy-weight in all respects. > > > > Yes, that is another possibility. What writable file systems or swap > > do you normally have mounted that it could be writing to, and on > > what storage device? > > The root fs is ext4 on virtio-blk. > > There are also several dozens of shrinkers that can be called during reclaim: > https://elixir.bootlin.com/linux/latest/C/ident/unregister_shrinker Right, unfortunately I don't see a smoking gun there either, unless you are also using NFS or devicemapper. Implementing VMAP_STACK as you suggested earlier is probably the best way to figure out if there is an actual overrun of the stack. Alternatively, adding support for GCC_PLUGIN_STACKLEAK might also help find out if we ever get close to the limit. This is probably less work, but it might not actually help in this case. Arnd
Re: [syzbot] kernel panic: corrupted stack end in openat
On Tue, Mar 16, 2021 at 5:28 PM Arnd Bergmann wrote: > > On Tue, Mar 16, 2021 at 5:13 PM Dmitry Vyukov wrote: > > > > On Tue, Mar 16, 2021 at 5:03 PM Arnd Bergmann wrote: > > > > > > On Tue, Mar 16, 2021 at 4:51 PM Russell King - ARM Linux admin > > > wrote: > > > > On Tue, Mar 16, 2021 at 04:44:45PM +0100, Arnd Bergmann wrote: > > > > > On Tue, Mar 16, 2021 at 11:17 AM Dmitry Vyukov > > > > > wrote: > > > > > > The compiler is gcc version 10.2.1 20210110 (Debian 10.2.1-6) > > > > > > > > > > Ok, building with Ubuntu 10.2.1-1ubuntu1 20201207 locally, that's > > > > > the closest I have installed, and I think the Debian and Ubuntu > > > > > versions > > > > > are generally quite close in case of gcc since they are maintained by > > > > > the same packagers. > > > > > > > > ... which shouldn't be a problem - that's just over 1/4 of the stack > > > > space. Could it be the syzbot's gcc is doing something weird and > > > > inflating the stack frames? > > > > > > It's possible, I think that's really unlikely given that it's just > > > Debian's > > > gcc, which is as close to mainline as the version I was using. > > > > > > Uwe's DEBUG_STACKOVERFLOW patch from a while ago might > > > help if this was the problem though: > > > https://lore.kernel.org/linux-arm-kernel/20200108082913.29710-1-u.kleine-koe...@pengutronix.de/ > > > > > > My best guess is something going wrong in the interrupt > > > that triggered the preempt_schedule() which ended up calling > > > task_stack_end_corrupted() in schedule_debug(), as you suggested > > > earlier. > > > > FWIW I see slightly larger frames with the config: > > > > 073ab64 : > > 8073ab64: e1a0c00dmov ip, sp > > 8073ab68: e92ddff0push{r4, r5, r6, r7, r8, r9, sl, > > fp, ip, lr, pc} > > 8073ab6c: e24cb004sub fp, ip, #4 > > 8073ab70: e24ddfa7sub sp, sp, #668; 0x29c > > Yes, this is the one that the compiler complained about when warning > for stack over 600 bytes. It's not called in this call chain though. 
> > > page_alloc can also do reclaim, I had the impression that reclaim can > > be quite heavy-weight in all respects. > > Yes, that is another possibility. What writable file systems or swap > do you normally have mounted that it could be writing to, and on > what storage device? The root fs is ext4 on virtio-blk. There are also several dozens of shrinkers that can be called during reclaim: https://elixir.bootlin.com/linux/latest/C/ident/unregister_shrinker
Re: [syzbot] kernel panic: corrupted stack end in openat
On Tue, Mar 16, 2021 at 5:13 PM Dmitry Vyukov wrote: > > On Tue, Mar 16, 2021 at 5:03 PM Arnd Bergmann wrote: > > > > On Tue, Mar 16, 2021 at 4:51 PM Russell King - ARM Linux admin > > wrote: > > > On Tue, Mar 16, 2021 at 04:44:45PM +0100, Arnd Bergmann wrote: > > > > On Tue, Mar 16, 2021 at 11:17 AM Dmitry Vyukov > > > > wrote: > > > > > The compiler is gcc version 10.2.1 20210110 (Debian 10.2.1-6) > > > > > > > > Ok, building with Ubuntu 10.2.1-1ubuntu1 20201207 locally, that's > > > > the closest I have installed, and I think the Debian and Ubuntu versions > > > > are generally quite close in case of gcc since they are maintained by > > > > the same packagers. > > > > > > ... which shouldn't be a problem - that's just over 1/4 of the stack > > > space. Could it be the syzbot's gcc is doing something weird and > > > inflating the stack frames? > > > > It's possible, I think that's really unlikely given that it's just Debian's > > gcc, which is as close to mainline as the version I was using. > > > > Uwe's DEBUG_STACKOVERFLOW patch from a while ago might > > help if this was the problem though: > > https://lore.kernel.org/linux-arm-kernel/20200108082913.29710-1-u.kleine-koe...@pengutronix.de/ > > > > My best guess is something going wrong in the interrupt > > that triggered the preempt_schedule() which ended up calling > > task_stack_end_corrupted() in schedule_debug(), as you suggested > > earlier. > > FWIW I see slightly larger frames with the config: > > 073ab64 : > 8073ab64: e1a0c00dmov ip, sp > 8073ab68: e92ddff0push{r4, r5, r6, r7, r8, r9, sl, > fp, ip, lr, pc} > 8073ab6c: e24cb004sub fp, ip, #4 > 8073ab70: e24ddfa7sub sp, sp, #668; 0x29c Yes, this is the one that the compiler complained about when warning for stack over 600 bytes. It's not called in this call chain though. > page_alloc can also do reclaim, I had the impression that reclaim can > be quite heavy-weight in all respects. Yes, that is another possibility. 
What writable file systems or swap do you normally have mounted that it could be writing to, and on what storage device? Arnd
Re: [syzbot] kernel panic: corrupted stack end in openat
On Tue, Mar 16, 2021 at 5:03 PM Arnd Bergmann wrote:
>
> On Tue, Mar 16, 2021 at 4:51 PM Russell King - ARM Linux admin wrote:
> > On Tue, Mar 16, 2021 at 04:44:45PM +0100, Arnd Bergmann wrote:
> > > On Tue, Mar 16, 2021 at 11:17 AM Dmitry Vyukov wrote:
> > > > The compiler is gcc version 10.2.1 20210110 (Debian 10.2.1-6)
> > >
> > > Ok, building with Ubuntu 10.2.1-1ubuntu1 20201207 locally, that's
> > > the closest I have installed, and I think the Debian and Ubuntu versions
> > > are generally quite close in case of gcc since they are maintained by
> > > the same packagers.
> >
> > ... which shouldn't be a problem - that's just over 1/4 of the stack
> > space. Could it be the syzbot's gcc is doing something weird and
> > inflating the stack frames?
>
> It's possible, I think that's really unlikely given that it's just Debian's
> gcc, which is as close to mainline as the version I was using.
>
> Uwe's DEBUG_STACKOVERFLOW patch from a while ago might
> help if this was the problem though:
> https://lore.kernel.org/linux-arm-kernel/20200108082913.29710-1-u.kleine-koe...@pengutronix.de/
>
> My best guess is something going wrong in the interrupt
> that triggered the preempt_schedule() which ended up calling
> task_stack_end_corrupted() in schedule_debug(), as you suggested
> earlier.

FWIW I see slightly larger frames with the config:

073ab64 :
8073ab64: e1a0c00d    mov ip, sp
8073ab68: e92ddff0    push {r4, r5, r6, r7, r8, r9, sl, fp, ip, lr, pc}
8073ab6c: e24cb004    sub fp, ip, #4
8073ab70: e24ddfa7    sub sp, sp, #668 ; 0x29c

page_alloc can also do reclaim, I had the impression that reclaim can
be quite heavy-weight in all respects.
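As a cross-check of the prologue quoted in this message, here is a minimal sketch that decodes the two opcodes by hand. It assumes the standard A32 encodings: a 16-bit register list in the low half of a push, and an 8-bit constant rotated right by twice the 4-bit rotation field in a data-processing immediate. The helper names are made up for illustration and are not kernel or binutils APIs.

```python
def ror32(value, amount):
    """Rotate a 32-bit value right by `amount` bits."""
    amount %= 32
    return ((value >> amount) | (value << (32 - amount))) & 0xFFFFFFFF


def arm_imm_operand(insn):
    """Decode the immediate of an A32 data-processing instruction.

    Bits [7:0] hold an 8-bit constant; bits [11:8] hold a rotation that
    is applied in steps of two bits.
    """
    imm8 = insn & 0xFF
    rot = (insn >> 8) & 0xF
    return ror32(imm8, 2 * rot)


def reglist_count(insn):
    """Count the registers named in the low 16 bits of a push."""
    return bin(insn & 0xFFFF).count("1")


# Opcodes copied from the disassembly in the message.
PUSH = 0xE92DDFF0    # push {r4, r5, r6, r7, r8, r9, sl, fp, ip, lr, pc}
SUB_SP = 0xE24DDFA7  # sub sp, sp, #668

saved_regs_bytes = 4 * reglist_count(PUSH)  # one 32-bit word per register
locals_bytes = arm_imm_operand(SUB_SP)
prologue_bytes = saved_regs_bytes + locals_bytes

print("saved registers:", saved_regs_bytes, "bytes")
print("locals:         ", locals_bytes, "bytes")
print("prologue total: ", prologue_bytes, "bytes")
```

The decoded immediate agrees with the `#668 ; 0x29c` annotation in the listing. The eleven pushed registers take further stack on top of that, which is worth keeping in mind when comparing this against a compiler frame-size warning.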
Re: [syzbot] kernel panic: corrupted stack end in openat
On Tue, Mar 16, 2021 at 4:51 PM Russell King - ARM Linux admin wrote:
> On Tue, Mar 16, 2021 at 04:44:45PM +0100, Arnd Bergmann wrote:
> > On Tue, Mar 16, 2021 at 11:17 AM Dmitry Vyukov wrote:
> > > The compiler is gcc version 10.2.1 20210110 (Debian 10.2.1-6)
> >
> > Ok, building with Ubuntu 10.2.1-1ubuntu1 20201207 locally, that's
> > the closest I have installed, and I think the Debian and Ubuntu versions
> > are generally quite close in case of gcc since they are maintained by
> > the same packagers.
>
> ... which shouldn't be a problem - that's just over 1/4 of the stack
> space. Could it be the syzbot's gcc is doing something weird and
> inflating the stack frames?

It's possible, but I think that's really unlikely given that it's just
Debian's gcc, which is as close to mainline as the version I was using.

Uwe's DEBUG_STACKOVERFLOW patch from a while ago might
help if this was the problem though:
https://lore.kernel.org/linux-arm-kernel/20200108082913.29710-1-u.kleine-koe...@pengutronix.de/

My best guess is something going wrong in the interrupt
that triggered the preempt_schedule() which ended up calling
task_stack_end_corrupted() in schedule_debug(), as you suggested
earlier.

      Arnd
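To illustrate why a stack-end check in schedule_debug() can fire long after the code that did the damage has returned, here is a toy model of a magic word at the low end of a downward-growing stack. It is a sketch under stated assumptions, not kernel code: the `ToyStack` class and its methods are hypothetical, the 8 KiB size is an assumption about arm32 kernel stacks, and the model ignores that the real layout also keeps other data at the bottom of the stack. The magic value is the kernel's STACK_END_MAGIC, which also shows up as r3 in the panic register dump of the original report.

```python
import struct

STACK_END_MAGIC = 0x57AC6E9D  # the kernel's STACK_END_MAGIC value
THREAD_SIZE = 8192            # assumption: 8 KiB kernel stack on arm32


class ToyStack:
    """A downward-growing stack with a magic word at its lowest address."""

    def __init__(self, size=THREAD_SIZE):
        self.mem = bytearray(size)
        self.sp = size  # empty stack: sp points just past the top
        struct.pack_into("<I", self.mem, 0, STACK_END_MAGIC)

    def push_frame(self, nbytes, fill=0xAA):
        """Reserve nbytes and write to all of them, as a busy frame might."""
        new_sp = self.sp - nbytes
        if new_sp < 0:
            raise MemoryError("ran off the toy address space")
        self.mem[new_sp:self.sp] = bytes([fill]) * nbytes
        self.sp = new_sp

    def pop_frame(self, nbytes):
        """Return from a frame; the bytes it wrote stay in memory."""
        self.sp += nbytes

    def end_corrupted(self):
        """Toy analogue of task_stack_end_corrupted()."""
        (word,) = struct.unpack_from("<I", self.mem, 0)
        return word != STACK_END_MAGIC


stack = ToyStack()
stack.push_frame(2258)                   # roughly the openat call chain
ok_before = not stack.end_corrupted()

deep = THREAD_SIZE - 2258                # something deep runs on top ...
stack.push_frame(deep)
stack.pop_frame(deep)                    # ... and returns again

detected_later = stack.end_corrupted()   # the check only fires now
print(ok_before, detected_later, stack.sp)
```

In this model the check reports corruption while the stack pointer is back at the depth of the innocent call chain, which matches the concern raised in the thread that the dumped path may not be the path that overran the stack.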
Re: [syzbot] kernel panic: corrupted stack end in openat
On Tue, Mar 16, 2021 at 4:51 PM Russell King - ARM Linux admin wrote: > > On Tue, Mar 16, 2021 at 04:44:45PM +0100, Arnd Bergmann wrote: > > On Tue, Mar 16, 2021 at 11:17 AM Dmitry Vyukov wrote: > > > On Tue, Mar 16, 2021 at 11:02 AM Arnd Bergmann wrote: > > > > > On Tue, Mar 16, 2021 at 8:18 AM syzbot > > > > > > > > > [<8073772c>] (integrity_kernel_read) from [<8073a904>] > > > > > > (ima_calc_file_hash_tfm+0x178/0x228 > > > > > > security/integrity/ima/ima_crypto.c:484) > > > > > > [<8073a78c>] (ima_calc_file_hash_tfm) from [<8073ae2c>] > > > > > > (ima_calc_file_shash security/integrity/ima/ima_crypto.c:515 > > > > > > [inline]) > > > > > > [<8073a78c>] (ima_calc_file_hash_tfm) from [<8073ae2c>] > > > > > > (ima_calc_file_hash+0x124/0x8b8 > > > > > > security/integrity/ima/ima_crypto.c:572) > > > > > > > > ima_calc_file_hash_tfm() has a SHASH_DESC_ON_STACK(), which by itself > > > > can > > > > use up 512 bytes, but KASAN sometimes triples this number. However, I > > > > see > > > > you do not actually have KASAN enabled, so there is probably more to it. > > > > > > The compiler is gcc version 10.2.1 20210110 (Debian 10.2.1-6) > > > > Ok, building with Ubuntu 10.2.1-1ubuntu1 20201207 locally, that's > > the closest I have installed, and I think the Debian and Ubuntu versions > > are generally quite close in case of gcc since they are maintained by > > the same packagers. > > > > I see ima_calc_field_array_hash_tfm() shows up as one of the larger > > stack users, but not alarmingly high: > > ../security/integrity/ima/ima_crypto.c: In function > > ‘ima_calc_field_array_hash_tfm’: > > ../security/integrity/ima/ima_crypto.c:624:1: warning: the frame size > > of 664 bytes is larger than 600 bytes [-Wframe-larger-than=] > > none of the other functions from the call chain have more than 600 bytes > > in this combination of config/compiler/sourcetree. > > > > In combination, I don't get to more than ~2300 bytes: > > ... 
which shouldn't be a problem - that's just over 1/4 of the stack
> space. Could it be the syzbot's gcc is doing something weird and
> inflating the stack frames?

It's just a stock Debian-provided gcc, which I would assume is also just
a plain gcc.
Re: [syzbot] kernel panic: corrupted stack end in openat
On Tue, Mar 16, 2021 at 04:44:45PM +0100, Arnd Bergmann wrote: > On Tue, Mar 16, 2021 at 11:17 AM Dmitry Vyukov wrote: > > On Tue, Mar 16, 2021 at 11:02 AM Arnd Bergmann wrote: > > > > On Tue, Mar 16, 2021 at 8:18 AM syzbot > > > > > > > [<8073772c>] (integrity_kernel_read) from [<8073a904>] > > > > > (ima_calc_file_hash_tfm+0x178/0x228 > > > > > security/integrity/ima/ima_crypto.c:484) > > > > > [<8073a78c>] (ima_calc_file_hash_tfm) from [<8073ae2c>] > > > > > (ima_calc_file_shash security/integrity/ima/ima_crypto.c:515 [inline]) > > > > > [<8073a78c>] (ima_calc_file_hash_tfm) from [<8073ae2c>] > > > > > (ima_calc_file_hash+0x124/0x8b8 > > > > > security/integrity/ima/ima_crypto.c:572) > > > > > > ima_calc_file_hash_tfm() has a SHASH_DESC_ON_STACK(), which by itself can > > > use up 512 bytes, but KASAN sometimes triples this number. However, I see > > > you do not actually have KASAN enabled, so there is probably more to it. > > > > The compiler is gcc version 10.2.1 20210110 (Debian 10.2.1-6) > > Ok, building with Ubuntu 10.2.1-1ubuntu1 20201207 locally, that's > the closest I have installed, and I think the Debian and Ubuntu versions > are generally quite close in case of gcc since they are maintained by > the same packagers. > > I see ima_calc_field_array_hash_tfm() shows up as one of the larger > stack users, but not alarmingly high: > ../security/integrity/ima/ima_crypto.c: In function > ‘ima_calc_field_array_hash_tfm’: > ../security/integrity/ima/ima_crypto.c:624:1: warning: the frame size > of 664 bytes is larger than 600 bytes [-Wframe-larger-than=] > none of the other functions from the call chain have more than 600 bytes > in this combination of config/compiler/sourcetree. > > In combination, I don't get to more than ~2300 bytes: ... which shouldn't be a problem - that's just over 1/4 of the stack space. Could it be the syzbot's gcc is doing something weird and inflating the stack frames? 
-- RMK's Patch system: https://www.armlinux.org.uk/developer/patches/ FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!
Re: [syzbot] kernel panic: corrupted stack end in openat
On Tue, Mar 16, 2021 at 11:17 AM Dmitry Vyukov wrote: > On Tue, Mar 16, 2021 at 11:02 AM Arnd Bergmann wrote: > > > On Tue, Mar 16, 2021 at 8:18 AM syzbot > > > > > [<8073772c>] (integrity_kernel_read) from [<8073a904>] > > > > (ima_calc_file_hash_tfm+0x178/0x228 > > > > security/integrity/ima/ima_crypto.c:484) > > > > [<8073a78c>] (ima_calc_file_hash_tfm) from [<8073ae2c>] > > > > (ima_calc_file_shash security/integrity/ima/ima_crypto.c:515 [inline]) > > > > [<8073a78c>] (ima_calc_file_hash_tfm) from [<8073ae2c>] > > > > (ima_calc_file_hash+0x124/0x8b8 security/integrity/ima/ima_crypto.c:572) > > > > ima_calc_file_hash_tfm() has a SHASH_DESC_ON_STACK(), which by itself can > > use up 512 bytes, but KASAN sometimes triples this number. However, I see > > you do not actually have KASAN enabled, so there is probably more to it. > > The compiler is gcc version 10.2.1 20210110 (Debian 10.2.1-6) Ok, building with Ubuntu 10.2.1-1ubuntu1 20201207 locally, that's the closest I have installed, and I think the Debian and Ubuntu versions are generally quite close in case of gcc since they are maintained by the same packagers. I see ima_calc_field_array_hash_tfm() shows up as one of the larger stack users, but not alarmingly high: ../security/integrity/ima/ima_crypto.c: In function ‘ima_calc_field_array_hash_tfm’: ../security/integrity/ima/ima_crypto.c:624:1: warning: the frame size of 664 bytes is larger than 600 bytes [-Wframe-larger-than=] none of the other functions from the call chain have more than 600 bytes in this combination of config/compiler/sourcetree. 
In combination, I don't get to more than ~2300 bytes:

[<818033d8>] (panic)                      52
[<8181f5b8>] (__schedule)                  0
[<81820430>] (preempt_schedule_common)     0
[<818204dc>] (preempt_schedule)            0
[<8048c7c0>] (kernel_init_free_pages)    148
[<804916ac>] (get_page_from_freelist)    212
[<80493264>] (__alloc_pages_nodemask)     44
[<8042f034>] (page_cache_ra_unbounded)    36
[<8042f2c8>] (do_page_cache_ra)           28
[<8042f418>] (ondemand_readahead)          0
[<8042f894>] (page_cache_async_ra)        68
[<80420ac8>] (filemap_get_pages)         120
[<80421110>] (filemap_read)               36
[<804215f0>] (generic_file_read_iter)      8
[<805ff430>] (ext4_file_read_iter)        96
[<804da3cc>] (__kernel_read)               8
[<8073772c>] (integrity_kernel_read)     412
[<8073a78c>] (ima_calc_file_hash_tfm)    164
[<8073ad08>] (ima_calc_file_hash)        106
[<8073bf84>] (ima_collect_measurement)   332
[<80738fec>] (process_measurement)        24
[<8073979c>] (ima_file_check)            172
[<804ec66c>] (path_openat)               152
[<804ef670>] (do_filp_open)               40
[<804d79c4>] (do_sys_openat2)

> Re printing FP, syzbot does not use custom patches:
> http://bit.do/syzbot#no-custom-patches
> But this does not seem to be syzbot-specific. It seems that any arm32
> stack overflow report will be unactionable, so I think it would be
> useful to include this into the mainline kernel to make overflow
> reports useful for everybody (and for syzbot as a side effect).

ok.

      Arnd
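The per-function figures in the table above can be totalled with a short script. This is only a sketch: the numbers are copied from the message as listed (do_sys_openat2 carries no figure there), and the 8 KiB stack size is an assumption about this arm32 configuration, chosen because it is consistent with the "just over 1/4 of the stack space" remark elsewhere in the thread.

```python
THREAD_SIZE = 8192  # assumption: 8 KiB kernel stack on arm32

# (function, bytes) pairs exactly as listed in the message above.
FRAMES = [
    ("panic", 52),
    ("__schedule", 0),
    ("preempt_schedule_common", 0),
    ("preempt_schedule", 0),
    ("kernel_init_free_pages", 148),
    ("get_page_from_freelist", 212),
    ("__alloc_pages_nodemask", 44),
    ("page_cache_ra_unbounded", 36),
    ("do_page_cache_ra", 28),
    ("ondemand_readahead", 0),
    ("page_cache_async_ra", 68),
    ("filemap_get_pages", 120),
    ("filemap_read", 36),
    ("generic_file_read_iter", 8),
    ("ext4_file_read_iter", 96),
    ("__kernel_read", 8),
    ("integrity_kernel_read", 412),
    ("ima_calc_file_hash_tfm", 164),
    ("ima_calc_file_hash", 106),
    ("ima_collect_measurement", 332),
    ("process_measurement", 24),
    ("ima_file_check", 172),
    ("path_openat", 152),
    ("do_filp_open", 40),
]

total = sum(size for _, size in FRAMES)
fraction = total / THREAD_SIZE
largest = sorted(FRAMES, key=lambda frame: frame[1], reverse=True)[:3]

print("total:", total, "bytes")
print("share of stack: {:.1%}".format(fraction))
for name, size in largest:
    print("  {:<26} {:>4}".format(name, size))
```

The listed figures add up to 2258 bytes, in line with the "~2300 bytes" estimate, and the three largest entries are all below the 600-byte warning threshold mentioned in the thread.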
Re: [syzbot] kernel panic: corrupted stack end in openat
On Tue, 16 Mar 2021 at 11:04, Arnd Bergmann wrote: > > On Tue, Mar 16, 2021 at 8:59 AM Dmitry Vyukov wrote: > > > > On Tue, Mar 16, 2021 at 8:18 AM syzbot > > wrote: > > > > > > Hello, > > > > > > syzbot found the following issue on: > > > > > > HEAD commit:1e28eed1 Linux 5.12-rc3 > > > git tree: upstream > > > console output: https://syzkaller.appspot.com/x/log.txt?x=167535e6d0 > > > kernel config: https://syzkaller.appspot.com/x/.config?x=e0cee1f53de33ca3 > > > dashboard link: > > > https://syzkaller.appspot.com/bug?extid=0b06ef9b44d00d600183 > > > userspace arch: arm > > > > > > Unfortunately, I don't have any reproducer for this issue yet. > > > > > > IMPORTANT: if you fix the issue, please add the following tag to the > > > commit: > > > Reported-by: syzbot+0b06ef9b44d00d600...@syzkaller.appspotmail.com > > > > +arm32 maintainer > > I think this is a real stack overflow on arm32, the stack is indeed deep. > > Nice find. I see there was already a second report, so it seems to be > reproducible as well. > If you are able to trigger this reliably, you could try printing the frame > pointer while unwinding to see what is actually going on: > > --- a/arch/arm/kernel/traps.c > +++ b/arch/arm/kernel/traps.c > @@ -68,8 +68,8 @@ void dump_backtrace_entry(unsigned long where, > unsigned long from, > unsigned long end = frame + 4 + sizeof(struct pt_regs); > > #ifdef CONFIG_KALLSYMS > - printk("%s[<%08lx>] (%ps) from [<%08lx>] (%pS)\n", > - loglvl, where, (void *)where, from, (void *)from); > + printk("%s[<%08lx>] (%ps) from [<%08lx>] (%pS), frame %08lx\n", > + loglvl, where, (void *)where, from, (void *)from, frame); > #else > printk("%sFunction entered at [<%08lx>] from [<%08lx>]\n", > loglvl, where, from); > > If that doesn't help, I could have a look at the binary to see which > functions in the call chain take a lot of stack space, if any. > > Which exact compiler version do you use for building these > kernels? 
I can try doing a build with the same commit and config.
>
> This one function is one that I have seen before when looking at build
> warnings with KASAN:
>
> > > [<8073772c>] (integrity_kernel_read) from [<8073a904>]
> > > (ima_calc_file_hash_tfm+0x178/0x228
> > > security/integrity/ima/ima_crypto.c:484)
> > > [<8073a78c>] (ima_calc_file_hash_tfm) from [<8073ae2c>]
> > > (ima_calc_file_shash security/integrity/ima/ima_crypto.c:515 [inline])
> > > [<8073a78c>] (ima_calc_file_hash_tfm) from [<8073ae2c>]
> > > (ima_calc_file_hash+0x124/0x8b8 security/integrity/ima/ima_crypto.c:572)
>
> ima_calc_file_hash_tfm() has a SHASH_DESC_ON_STACK(), which by itself can
> use up 512 bytes, but KASAN sometimes triples this number. However, I see
> you do not actually have KASAN enabled, so there is probably more to it.
>

FYI, as an aside, the SHASH_DESC_ON_STACK() issue was fixed in

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=660d2062190db131d2feaf19914e90f868fe285c

(note that the size of SHASH_DESC_ON_STACK() accounts for two struct
shash_desc instances)
Re: [syzbot] kernel panic: corrupted stack end in openat
On Tue, Mar 16, 2021 at 11:02 AM Arnd Bergmann wrote: > > On Tue, Mar 16, 2021 at 8:18 AM syzbot > > wrote: > > > > > > Hello, > > > > > > syzbot found the following issue on: > > > > > > HEAD commit:1e28eed1 Linux 5.12-rc3 > > > git tree: upstream > > > console output: https://syzkaller.appspot.com/x/log.txt?x=167535e6d0 > > > kernel config: https://syzkaller.appspot.com/x/.config?x=e0cee1f53de33ca3 > > > dashboard link: > > > https://syzkaller.appspot.com/bug?extid=0b06ef9b44d00d600183 > > > userspace arch: arm > > > > > > Unfortunately, I don't have any reproducer for this issue yet. > > > > > > IMPORTANT: if you fix the issue, please add the following tag to the > > > commit: > > > Reported-by: syzbot+0b06ef9b44d00d600...@syzkaller.appspotmail.com > > > > +arm32 maintainer > > I think this is a real stack overflow on arm32, the stack is indeed deep. > > Nice find. I see there was already a second report, so it seems to be > reproducible as well. > If you are able to trigger this reliably, you could try printing the frame > pointer while unwinding to see what is actually going on: > > --- a/arch/arm/kernel/traps.c > +++ b/arch/arm/kernel/traps.c > @@ -68,8 +68,8 @@ void dump_backtrace_entry(unsigned long where, > unsigned long from, > unsigned long end = frame + 4 + sizeof(struct pt_regs); > > #ifdef CONFIG_KALLSYMS > - printk("%s[<%08lx>] (%ps) from [<%08lx>] (%pS)\n", > - loglvl, where, (void *)where, from, (void *)from); > + printk("%s[<%08lx>] (%ps) from [<%08lx>] (%pS), frame %08lx\n", > + loglvl, where, (void *)where, from, (void *)from, frame); > #else > printk("%sFunction entered at [<%08lx>] from [<%08lx>]\n", > loglvl, where, from); > > If that doesn't help, I could have a look at the binary to see which > functions in the call chain take a lot of stack space, if any. > > Which exact compiler version do you use for building these > kernels? I can try doing a build with the same commit and config. 
> > This one function is one that I have seen before when looking at build > warnings with KASAN: > > > > [<8073772c>] (integrity_kernel_read) from [<8073a904>] > > > (ima_calc_file_hash_tfm+0x178/0x228 > > > security/integrity/ima/ima_crypto.c:484) > > > [<8073a78c>] (ima_calc_file_hash_tfm) from [<8073ae2c>] > > > (ima_calc_file_shash security/integrity/ima/ima_crypto.c:515 [inline]) > > > [<8073a78c>] (ima_calc_file_hash_tfm) from [<8073ae2c>] > > > (ima_calc_file_hash+0x124/0x8b8 security/integrity/ima/ima_crypto.c:572) > > ima_calc_file_hash_tfm() has a SHASH_DESC_ON_STACK(), which by itself can > use up 512 bytes, but KASAN sometimes triples this number. However, I see > you do not actually have KASAN enabled, so there is probably more to it. The compiler is gcc version 10.2.1 20210110 (Debian 10.2.1-6) It's available in gcr.io/syzkaller/syzbot container. (syzbot should have been provided the compiler version, something broke, I've filed https://github.com/google/syzkaller/issues/2498 for this) Yes, KASAN is not enabled on arm32 for now. Re printing FP, syzbot does not use custom patches: http://bit.do/syzbot#no-custom-patches But this does not seem to be syzbot-specific. It seems that any arm32 stack overflow report will be unactionable, so I think it would be useful to include this into the mainline kernel to make overflow reports useful for everybody (and for syzbot as a side effect).
Re: [syzbot] kernel panic: corrupted stack end in openat
On Tue, Mar 16, 2021 at 8:59 AM Dmitry Vyukov wrote:
>
> On Tue, Mar 16, 2021 at 8:18 AM syzbot wrote:
> >
> > Hello,
> >
> > syzbot found the following issue on:
> >
> > HEAD commit:    1e28eed1 Linux 5.12-rc3
> > git tree:       upstream
> > console output: https://syzkaller.appspot.com/x/log.txt?x=167535e6d0
> > kernel config:  https://syzkaller.appspot.com/x/.config?x=e0cee1f53de33ca3
> > dashboard link: https://syzkaller.appspot.com/bug?extid=0b06ef9b44d00d600183
> > userspace arch: arm
> >
> > Unfortunately, I don't have any reproducer for this issue yet.
> >
> > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > Reported-by: syzbot+0b06ef9b44d00d600...@syzkaller.appspotmail.com
>
> +arm32 maintainer
> I think this is a real stack overflow on arm32, the stack is indeed deep.

Nice find. I see there was already a second report, so it seems to be
reproducible as well.
If you are able to trigger this reliably, you could try printing the frame
pointer while unwinding to see what is actually going on:

--- a/arch/arm/kernel/traps.c
+++ b/arch/arm/kernel/traps.c
@@ -68,8 +68,8 @@ void dump_backtrace_entry(unsigned long where, unsigned long from,
        unsigned long end = frame + 4 + sizeof(struct pt_regs);

 #ifdef CONFIG_KALLSYMS
-       printk("%s[<%08lx>] (%ps) from [<%08lx>] (%pS)\n",
-               loglvl, where, (void *)where, from, (void *)from);
+       printk("%s[<%08lx>] (%ps) from [<%08lx>] (%pS), frame %08lx\n",
+               loglvl, where, (void *)where, from, (void *)from, frame);
 #else
        printk("%sFunction entered at [<%08lx>] from [<%08lx>]\n",
                loglvl, where, from);

If that doesn't help, I could have a look at the binary to see which
functions in the call chain take a lot of stack space, if any.

Which exact compiler version do you use for building these
kernels? I can try doing a build with the same commit and config.

This one function is one that I have seen before when looking at build
warnings with KASAN:

> > [<8073772c>] (integrity_kernel_read) from [<8073a904>]
> > (ima_calc_file_hash_tfm+0x178/0x228 security/integrity/ima/ima_crypto.c:484)
> > [<8073a78c>] (ima_calc_file_hash_tfm) from [<8073ae2c>]
> > (ima_calc_file_shash security/integrity/ima/ima_crypto.c:515 [inline])
> > [<8073a78c>] (ima_calc_file_hash_tfm) from [<8073ae2c>]
> > (ima_calc_file_hash+0x124/0x8b8 security/integrity/ima/ima_crypto.c:572)

ima_calc_file_hash_tfm() has a SHASH_DESC_ON_STACK(), which by itself can
use up 512 bytes, but KASAN sometimes triples this number. However, I see
you do not actually have KASAN enabled, so there is probably more to it.

      Arnd
Re: [syzbot] kernel panic: corrupted stack end in openat
On Tue, Mar 16, 2021 at 10:24 AM Russell King - ARM Linux admin wrote: > > On Tue, Mar 16, 2021 at 08:59:17AM +0100, Dmitry Vyukov wrote: > > On Tue, Mar 16, 2021 at 8:18 AM syzbot > > wrote: > > > > > > Hello, > > > > > > syzbot found the following issue on: > > > > > > HEAD commit:1e28eed1 Linux 5.12-rc3 > > > git tree: upstream > > > console output: https://syzkaller.appspot.com/x/log.txt?x=167535e6d0 > > > kernel config: https://syzkaller.appspot.com/x/.config?x=e0cee1f53de33ca3 > > > dashboard link: > > > https://syzkaller.appspot.com/bug?extid=0b06ef9b44d00d600183 > > > userspace arch: arm > > > > > > Unfortunately, I don't have any reproducer for this issue yet. > > > > > > IMPORTANT: if you fix the issue, please add the following tag to the > > > commit: > > > Reported-by: syzbot+0b06ef9b44d00d600...@syzkaller.appspotmail.com > > > > +arm32 maintainer > > I think this is a real stack overflow on arm32, the stack is indeed deep. > > There's no way to know for sure because there's no indication of the > stack pointer in this, so we don't know how much space remains. > Therefore we don't know whether this is something in the dumped > path, or an interrupt causing it. Agree, to know for sure we would need support for VMAP_STACK. But do we really need to know it? If it's an interrupt on top, it does not make any difference?
Re: [syzbot] kernel panic: corrupted stack end in openat
On Tue, Mar 16, 2021 at 08:59:17AM +0100, Dmitry Vyukov wrote:
> On Tue, Mar 16, 2021 at 8:18 AM syzbot wrote:
> >
> > Hello,
> >
> > syzbot found the following issue on:
> >
> > HEAD commit:    1e28eed1 Linux 5.12-rc3
> > git tree:       upstream
> > console output: https://syzkaller.appspot.com/x/log.txt?x=167535e6d0
> > kernel config:  https://syzkaller.appspot.com/x/.config?x=e0cee1f53de33ca3
> > dashboard link: https://syzkaller.appspot.com/bug?extid=0b06ef9b44d00d600183
> > userspace arch: arm
> >
> > Unfortunately, I don't have any reproducer for this issue yet.
> >
> > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > Reported-by: syzbot+0b06ef9b44d00d600...@syzkaller.appspotmail.com
>
> +arm32 maintainer
> I think this is a real stack overflow on arm32, the stack is indeed deep.

There's no way to know for sure because there's no indication of the
stack pointer in this, so we don't know how much space remains.
Therefore we don't know whether this is something in the dumped
path, or an interrupt causing it.

--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!
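The point about the missing stack pointer can be illustrated with a small user-space sketch: if sp were known, the remaining space would follow from simple address arithmetic. This assumes a downward-growing stack aligned to its own power-of-two size, with some reserved bytes at the bottom; the function names and the `reserved` parameter are hypothetical and are not kernel APIs.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Assumption: the stack is thread_size-aligned, so its lowest address
 * can be recovered by masking the low bits of any address inside it.
 */
static uint32_t stack_base(uint32_t sp, uint32_t thread_size)
{
    /* thread_size must be a non-zero power of two for the mask to work. */
    assert(thread_size != 0 && (thread_size & (thread_size - 1)) == 0);
    return sp & ~(thread_size - 1);
}

/*
 * Bytes still available below sp. `reserved` models whatever lives at
 * the bottom of the stack area and must not be overwritten.
 */
static uint32_t stack_bytes_left(uint32_t sp, uint32_t thread_size,
                                 uint32_t reserved)
{
    uint32_t above_base = sp - stack_base(sp, thread_size);

    return above_base > reserved ? above_base - reserved : 0;
}
```

For example, `stack_bytes_left(0x8588956c, 8192, 0)` gives 5484. The address is just a register value from the dump (r8 in the page_cache_ra_unbounded frame) that falls in the same 8 KiB block as r4:85888000; treating it as a stack pointer is purely illustrative, since the dump does not report sp.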
Re: [syzbot] kernel panic: corrupted stack end in openat
On Tue, Mar 16, 2021 at 8:18 AM syzbot wrote:
>
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit:    1e28eed1 Linux 5.12-rc3
> git tree:       upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=167535e6d0
> kernel config:  https://syzkaller.appspot.com/x/.config?x=e0cee1f53de33ca3
> dashboard link: https://syzkaller.appspot.com/bug?extid=0b06ef9b44d00d600183
> userspace arch: arm
>
> Unfortunately, I don't have any reproducer for this issue yet.
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+0b06ef9b44d00d600...@syzkaller.appspotmail.com

+arm32 maintainer
I think this is a real stack overflow on arm32, the stack is indeed deep.

> Kernel panic - not syncing: corrupted stack end detected inside scheduler
> CPU: 0 PID: 3263 Comm: syz-fuzzer Not tainted 5.12.0-rc3-syzkaller #0
> Hardware name: ARM-Versatile Express
> Backtrace:
> [<81802700>] (dump_backtrace) from [<81802974>] (show_stack+0x18/0x1c arch/arm/kernel/traps.c:252)
>  r7:0080 r6:6093 r5: r4:82b58544
> [<8180295c>] (show_stack) from [<8180a048>] (__dump_stack lib/dump_stack.c:79 [inline])
> [<8180295c>] (show_stack) from [<8180a048>] (dump_stack+0xb8/0xe8 lib/dump_stack.c:120)
> [<81809f90>] (dump_stack) from [<81803508>] (panic+0x130/0x378 kernel/panic.c:231)
>  r7:81f4bdc0 r6:82a392a4 r5: r4:82c6b0d0
> [<818033d8>] (panic) from [<81820270>] (schedule_debug kernel/sched/core.c:4822 [inline])
> [<818033d8>] (panic) from [<81820270>] (__schedule+0xcb8/0xcc8 kernel/sched/core.c:4967)
>  r3:57ac6e9d r2:0004 r1:81f5a53c r0:81f4bdc0
>  r7:0001
> [<8181f5b8>] (__schedule) from [<8182046c>] (preempt_schedule_common+0x3c/0xac kernel/sched/core.c:5233)
>  r10:071f r9:ffefd000 r8:0001 r7:81820510 r6:0001 r5:81820510
>  r4:85888000
> [<81820430>] (preempt_schedule_common) from [<81820510>] (preempt_schedule+0x34/0x38 kernel/sched/core.c:5258)
>  r7:82c6a4e0 r6:0001 r5:85888000 r4:df48d420
> [<818204dc>] (preempt_schedule) from [<8048c884>] (__kunmap_atomic include/linux/highmem-internal.h:114 [inline])
> [<818204dc>] (preempt_schedule) from [<8048c884>] (clear_highpage include/linux/highmem.h:204 [inline])
> [<818204dc>] (preempt_schedule) from [<8048c884>] (kernel_init_free_pages+0xc4/0xd0 mm/page_alloc.c:1212)
> [<8048c7c0>] (kernel_init_free_pages) from [<80492ce8>] (post_alloc_hook mm/page_alloc.c:2305 [inline])
> [<8048c7c0>] (kernel_init_free_pages) from [<80492ce8>] (prep_new_page mm/page_alloc.c:2311 [inline])
> [<8048c7c0>] (kernel_init_free_pages) from [<80492ce8>] (get_page_from_freelist+0x163c/0x1698 mm/page_alloc.c:3951)
>  r10:df48d3f0 r9:82bf89c0 r8:df48d3f0 r7:000b r6:0002 r5:0001
>  r4:df48d3f8 r3:0001
> [<804916ac>] (get_page_from_freelist) from [<804933c8>] (__alloc_pages_nodemask+0x164/0x1850 mm/page_alloc.c:5001)
>  r10: r9:860a9a80 r8:00112cca r7:000b r6:0081 r5:0008
>  r4:
> [<80493264>] (__alloc_pages_nodemask) from [<8042f0f8>] (__alloc_pages include/linux/gfp.h:525 [inline])
> [<80493264>] (__alloc_pages_nodemask) from [<8042f0f8>] (__alloc_pages_node include/linux/gfp.h:538 [inline])
> [<80493264>] (__alloc_pages_nodemask) from [<8042f0f8>] (alloc_pages_node include/linux/gfp.h:552 [inline])
> [<80493264>] (__alloc_pages_nodemask) from [<8042f0f8>] (alloc_pages include/linux/gfp.h:571 [inline])
> [<80493264>] (__alloc_pages_nodemask) from [<8042f0f8>] (__page_cache_alloc include/linux/pagemap.h:289 [inline])
> [<80493264>] (__alloc_pages_nodemask) from [<8042f0f8>] (page_cache_ra_unbounded+0xc4/0x294 mm/readahead.c:216)
>  r10:860a9a84 r9:860a9a80 r8:8588956c r7:000b r6:0107 r5:85889688
>  r4:df48d3c0
> [<8042f034>] (page_cache_ra_unbounded) from [<8042f3c4>] (do_page_cache_ra+0xfc/0x150 mm/readahead.c:267)
>  r10:860a99ac r9:0001 r8:0020 r7:8013 r6:8042f624 r5:85889688
>  r4:860a9908
> [<8042f2c8>] (do_page_cache_ra) from [<8042f624>] (ondemand_readahead+0x20c/0x47c mm/readahead.c:549)
>  r10:0001 r9:00fc r8:0020 r7:00dc r6:85889688 r5:
>  r4:85aa8ea0
> [<8042f418>] (ondemand_readahead) from [<8042f958>] (page_cache_async_ra mm/readahead.c:607 [inline])
> [<8042f418>] (ondemand_readahead) from [<8042f958>] (page_cache_async_ra+0xc4/0x110 mm/readahead.c:581)
>  r10:85889818 r9:df4a4be0 r8:860a9a80 r7:85889714 r6: r5:85889688
>  r4:85aa8ea0
> [<8042f894>] (page_cache_async_ra) from [<80420d1c>] (page_cache_async_readahead include/linux/pagemap.h:863 [inline])
> [<8042f894>] (page_cache_async_ra) from [<80420d1c>] (filemap_readahead mm/filemap.c:2350 [inline])
> [<8042f894>] (page_cache_async_ra) from [<80420d1c>] (filemap_get_pages+0x254/0x648 mm/filemap.c:2391)
>  r7:85889714 r6:00db r5:85889830 r4:00dc
> [<80420ac8>] (filemap_get_pages) from [<804211d8>] (filemap_read+0xc8/0x4e0 mm/filemap.c:2458)
>
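For readers unfamiliar with the "corrupted stack end" check that fired in schedule_debug(), here is a toy user-space model of the idea: a magic word sits at the lowest usable stack address, and the scheduler panics if it has been overwritten. STACK_END_MAGIC is the kernel's real constant; everything else (the toy struct, its size, the function names) is invented for illustration and simplifies where the kernel actually places the word.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define STACK_END_MAGIC 0x57AC6E9Du /* include/uapi/linux/magic.h */
#define TOY_STACK_WORDS 16u         /* hypothetical toy stack size */

/* Toy stack: words[0] is the lowest address, i.e. the "stack end". */
struct toy_stack {
    uint32_t words[TOY_STACK_WORDS];
};

/* Plant the canary at the stack end, as done when a task is created. */
static void toy_stack_init(struct toy_stack *s)
{
    memset(s->words, 0, sizeof(s->words));
    s->words[0] = STACK_END_MAGIC;
}

/* Same idea as task_stack_end_corrupted(): is the canary still intact? */
static bool toy_stack_end_corrupted(const struct toy_stack *s)
{
    return s->words[0] != STACK_END_MAGIC;
}

/* Simulate a downward-growing stack consuming `depth` words from the top. */
static void toy_stack_push_frames(struct toy_stack *s, unsigned int depth)
{
    assert(depth <= TOY_STACK_WORDS);
    for (unsigned int i = 0; i < depth; i++)
        s->words[TOY_STACK_WORDS - 1 - i] = 0xdeadbeefu;
}
```

The check only notices the damage after the fact, at the next pass through the scheduler, so the backtrace shows where the overwrite was detected rather than necessarily where it happened. Note that r3:57ac6e9d in the quoted __schedule frame matches this magic value.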
[syzbot] kernel panic: corrupted stack end in openat
Hello,

syzbot found the following issue on:

HEAD commit:    1e28eed1 Linux 5.12-rc3
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=167535e6d0
kernel config:  https://syzkaller.appspot.com/x/.config?x=e0cee1f53de33ca3
dashboard link: https://syzkaller.appspot.com/bug?extid=0b06ef9b44d00d600183
userspace arch: arm

Unfortunately, I don't have any reproducer for this issue yet.

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+0b06ef9b44d00d600...@syzkaller.appspotmail.com

Kernel panic - not syncing: corrupted stack end detected inside scheduler
CPU: 0 PID: 3263 Comm: syz-fuzzer Not tainted 5.12.0-rc3-syzkaller #0
Hardware name: ARM-Versatile Express
Backtrace:
[<81802700>] (dump_backtrace) from [<81802974>] (show_stack+0x18/0x1c arch/arm/kernel/traps.c:252)
 r7:0080 r6:6093 r5: r4:82b58544
[<8180295c>] (show_stack) from [<8180a048>] (__dump_stack lib/dump_stack.c:79 [inline])
[<8180295c>] (show_stack) from [<8180a048>] (dump_stack+0xb8/0xe8 lib/dump_stack.c:120)
[<81809f90>] (dump_stack) from [<81803508>] (panic+0x130/0x378 kernel/panic.c:231)
 r7:81f4bdc0 r6:82a392a4 r5: r4:82c6b0d0
[<818033d8>] (panic) from [<81820270>] (schedule_debug kernel/sched/core.c:4822 [inline])
[<818033d8>] (panic) from [<81820270>] (__schedule+0xcb8/0xcc8 kernel/sched/core.c:4967)
 r3:57ac6e9d r2:0004 r1:81f5a53c r0:81f4bdc0
 r7:0001
[<8181f5b8>] (__schedule) from [<8182046c>] (preempt_schedule_common+0x3c/0xac kernel/sched/core.c:5233)
 r10:071f r9:ffefd000 r8:0001 r7:81820510 r6:0001 r5:81820510
 r4:85888000
[<81820430>] (preempt_schedule_common) from [<81820510>] (preempt_schedule+0x34/0x38 kernel/sched/core.c:5258)
 r7:82c6a4e0 r6:0001 r5:85888000 r4:df48d420
[<818204dc>] (preempt_schedule) from [<8048c884>] (__kunmap_atomic include/linux/highmem-internal.h:114 [inline])
[<818204dc>] (preempt_schedule) from [<8048c884>] (clear_highpage include/linux/highmem.h:204 [inline])
[<818204dc>] (preempt_schedule) from [<8048c884>] (kernel_init_free_pages+0xc4/0xd0 mm/page_alloc.c:1212)
[<8048c7c0>] (kernel_init_free_pages) from [<80492ce8>] (post_alloc_hook mm/page_alloc.c:2305 [inline])
[<8048c7c0>] (kernel_init_free_pages) from [<80492ce8>] (prep_new_page mm/page_alloc.c:2311 [inline])
[<8048c7c0>] (kernel_init_free_pages) from [<80492ce8>] (get_page_from_freelist+0x163c/0x1698 mm/page_alloc.c:3951)
 r10:df48d3f0 r9:82bf89c0 r8:df48d3f0 r7:000b r6:0002 r5:0001
 r4:df48d3f8 r3:0001
[<804916ac>] (get_page_from_freelist) from [<804933c8>] (__alloc_pages_nodemask+0x164/0x1850 mm/page_alloc.c:5001)
 r10: r9:860a9a80 r8:00112cca r7:000b r6:0081 r5:0008
 r4:
[<80493264>] (__alloc_pages_nodemask) from [<8042f0f8>] (__alloc_pages include/linux/gfp.h:525 [inline])
[<80493264>] (__alloc_pages_nodemask) from [<8042f0f8>] (__alloc_pages_node include/linux/gfp.h:538 [inline])
[<80493264>] (__alloc_pages_nodemask) from [<8042f0f8>] (alloc_pages_node include/linux/gfp.h:552 [inline])
[<80493264>] (__alloc_pages_nodemask) from [<8042f0f8>] (alloc_pages include/linux/gfp.h:571 [inline])
[<80493264>] (__alloc_pages_nodemask) from [<8042f0f8>] (__page_cache_alloc include/linux/pagemap.h:289 [inline])
[<80493264>] (__alloc_pages_nodemask) from [<8042f0f8>] (page_cache_ra_unbounded+0xc4/0x294 mm/readahead.c:216)
 r10:860a9a84 r9:860a9a80 r8:8588956c r7:000b r6:0107 r5:85889688
 r4:df48d3c0
[<8042f034>] (page_cache_ra_unbounded) from [<8042f3c4>] (do_page_cache_ra+0xfc/0x150 mm/readahead.c:267)
 r10:860a99ac r9:0001 r8:0020 r7:8013 r6:8042f624 r5:85889688
 r4:860a9908
[<8042f2c8>] (do_page_cache_ra) from [<8042f624>] (ondemand_readahead+0x20c/0x47c mm/readahead.c:549)
 r10:0001 r9:00fc r8:0020 r7:00dc r6:85889688 r5:
 r4:85aa8ea0
[<8042f418>] (ondemand_readahead) from [<8042f958>] (page_cache_async_ra mm/readahead.c:607 [inline])
[<8042f418>] (ondemand_readahead) from [<8042f958>] (page_cache_async_ra+0xc4/0x110 mm/readahead.c:581)
 r10:85889818 r9:df4a4be0 r8:860a9a80 r7:85889714 r6: r5:85889688
 r4:85aa8ea0
[<8042f894>] (page_cache_async_ra) from [<80420d1c>] (page_cache_async_readahead include/linux/pagemap.h:863 [inline])
[<8042f894>] (page_cache_async_ra) from [<80420d1c>] (filemap_readahead mm/filemap.c:2350 [inline])
[<8042f894>] (page_cache_async_ra) from [<80420d1c>] (filemap_get_pages+0x254/0x648 mm/filemap.c:2391)
 r7:85889714 r6:00db r5:85889830 r4:00dc
[<80420ac8>] (filemap_get_pages) from [<804211d8>] (filemap_read+0xc8/0x4e0 mm/filemap.c:2458)
 r10:85889818 r9:860a9908 r8:805ff484 r7:85889830 r6: r5:85889818
 r4:85889830
[<80421110>] (filemap_read) from [<80421788>] (generic_file_read_iter+0x198/0x234 mm/filemap.c:2609)
 r10:1000 r9: r8:805ff484 r7:1000 r6: r5:85889818
 r4:85889830
[<804215f0>] (generic_file_read_iter) from [<805ff484>]