Re: Kernel panic not printing a call trace?
(Sorry for the messy quoting, I'm not actually on the list so I didn't
see this reply until I thought to check the ML archives)

> If the stack is corrupted the backtrace may or may not be affected.

Sure, but it happening every time is pretty surprising to me.

> Why not bisect the kernel to find the actual bug?

A) I'm going to try booting variously old versions of the kernel, but...
B) I don't actually know that there was a version where the problem
   I'm encountering didn't exist, so it's a relatively open search, and
C) Actually compiling kernels on this hardware will take an age each
   time, so I was hoping to get better insight into the bug through a
   stacktrace.

- Rich

On Wed, May 12, 2021 at 10:49 PM Rich wrote:
>
> Hi all,
> So, I got my earlier system running sparc64 using a terrible method
> (from inside the existing sparc install, mount -o remount,ro /; nc -l
> | dd of=/dev/sda [...] an image generated in a VM, reboot and pray),
> but now I'm doing the thing I actually wanted a sparc64 system for
> (testing a kernel module on sparc64), and encountering a problem.
>
> While running through its test suite, when it runs through a certain
> suite of tests, every time (so far) it dies in the same annoying
> fashion:
> [ 1435.191913] Kernel panic - not syncing: corrupted stack end detected inside scheduler
> [ 1435.294939] CPU: 0 PID: 722 Comm: spl_system_task Tainted: P OE 5.10.0-6-sparc64 #1 Debian 5.10.28-1
> [ 1435.431126] Call Trace:
> [ 1435.463267] Press Stop-A (L1-A) from sun keyboard or send break
> [ 1435.463267] twice on console to return to the boot prom
> [ 1435.609777] ---[ end Kernel panic - not syncing: corrupted stack end detected inside scheduler ]---
>
> RED State Exception
>
> TL=...0005 TT=...0010
> TPC=..0042.4200 TnPC=..0042.4204 TSTATE=..8000.1506
> TL=...0004 TT=...0010
> TPC=..0042.4200 TnPC=..0042.4204 TSTATE=..8000.1506
> TL=...0003 TT=...0010
> TPC=..0042.4200 TnPC=..0042.4204 TSTATE=..8000.1506
> TL=...0002 TT=...0010
> TPC=..0040.70d0 TnPC=..0040.70d4 TSTATE=..8004.1406
> TL=...0001 TT=...0068
> TPC=..0048.bba4 TnPC=..0048.bba8 TSTATE=..8000.1606
>
> Watchdog Reset
> Externally Initiated Reset
> ok
>
> (Sometimes, it winds up so disgruntled, the watchdog reset never
> triggers, break twice on the console doesn't work, you need to
> physically power cycle it.)
>
> I'm mostly curious about whether anyone knows why the Call Trace might
> be empty - I see the message about corrupted stack end above it, but
> from what I can see online, plenty of people get that message and a
> call trace printout below it (...on other architectures, at least).
> https://lists.debian.org/debian-sparc/2016/09/msg2.html is even an
> example of someone on this very list.
>
> Does anyone have any insights? Or am I going to have to resort to
> printks in random parts of the thread the panic notes and hope I find
> the problem?
>
> Thanks!
> - Rich
Re: Kernel panic not printing a call trace?
On 5/13/21 4:49 AM, Rich wrote:
> I'm mostly curious about whether anyone knows why the Call Trace might
> be empty - I see the message about corrupted stack end above it, but
> from what I can see online, plenty of people get that message and a
> call trace printout below it (...on other architectures, at least).
> https://lists.debian.org/debian-sparc/2016/09/msg2.html is even an
> example of someone on this very list.

If the stack is corrupted the backtrace may or may not be affected.

> Does anyone have any insights? Or am I going to have to resort to
> printks in random parts of the thread the panic notes and hope I find
> the problem?

Why not bisect the kernel to find the actual bug?

Adrian

-- 
 .''`.  John Paul Adrian Glaubitz
: :' :  Debian Developer - glaub...@debian.org
`. `'   Freie Universitaet Berlin - glaub...@physik.fu-berlin.de
  `-    GPG: 62FF 8A75 84E0 2956 9546 0006 7426 3B37 F5B5 F913
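
To make the bisection suggestion concrete, here is a minimal sketch of the search logic in Python. It is not git bisect itself, only the idea behind it. The names find_first_bad, versions and is_bad are hypothetical; is_bad stands in for "boot this kernel, run the test suite, see whether it panics". The sketch assumes that once a version is bad, every later one is bad too, and it gives up when no known-good version exists.

```python
from typing import Callable, Optional, Sequence


def find_first_bad(versions: Sequence[str],
                   is_bad: Callable[[str], bool]) -> Optional[str]:
    """Return the first bad version in an oldest-to-newest list.

    Returns None when the oldest version is already bad (no known-good
    starting point, so bisection cannot narrow anything down) or when
    the newest version is good (nothing to find).
    """
    if not versions:
        return None
    if is_bad(versions[0]) or not is_bad(versions[-1]):
        return None
    # Invariant: versions[lo] is good, versions[hi] is bad.
    lo, hi = 0, len(versions) - 1
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(versions[mid]):
            hi = mid
        else:
            lo = mid
    return versions[hi]


if __name__ == "__main__":
    # Hypothetical data for illustration only: pretend the panic
    # first appeared in 5.8.
    kernels = ["4.19", "5.4", "5.7", "5.8", "5.9", "5.10"]
    first_bad_index = kernels.index("5.8")
    print(find_first_bad(kernels, lambda v: kernels.index(v) >= first_bad_index))
```

The search needs roughly log2(N) boots for N candidate versions, so it is cheapest when already-built kernels can be booted instead of compiling each one on the slow machine. If the oldest bootable version already panics, the helper returns None, which matches the "open search" situation where bisection has nothing to work with.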
Kernel panic not printing a call trace?
Hi all,
So, I got my earlier system running sparc64 using a terrible method
(from inside the existing sparc install, mount -o remount,ro /; nc -l
| dd of=/dev/sda [...] an image generated in a VM, reboot and pray),
but now I'm doing the thing I actually wanted a sparc64 system for
(testing a kernel module on sparc64), and encountering a problem.

While running through its test suite, when it runs through a certain
suite of tests, every time (so far) it dies in the same annoying
fashion:
[ 1435.191913] Kernel panic - not syncing: corrupted stack end detected inside scheduler
[ 1435.294939] CPU: 0 PID: 722 Comm: spl_system_task Tainted: P OE 5.10.0-6-sparc64 #1 Debian 5.10.28-1
[ 1435.431126] Call Trace:
[ 1435.463267] Press Stop-A (L1-A) from sun keyboard or send break
[ 1435.463267] twice on console to return to the boot prom
[ 1435.609777] ---[ end Kernel panic - not syncing: corrupted stack end detected inside scheduler ]---

RED State Exception

TL=...0005 TT=...0010
TPC=..0042.4200 TnPC=..0042.4204 TSTATE=..8000.1506
TL=...0004 TT=...0010
TPC=..0042.4200 TnPC=..0042.4204 TSTATE=..8000.1506
TL=...0003 TT=...0010
TPC=..0042.4200 TnPC=..0042.4204 TSTATE=..8000.1506
TL=...0002 TT=...0010
TPC=..0040.70d0 TnPC=..0040.70d4 TSTATE=..8004.1406
TL=...0001 TT=...0068
TPC=..0048.bba4 TnPC=..0048.bba8 TSTATE=..8000.1606

Watchdog Reset
Externally Initiated Reset
ok

(Sometimes, it winds up so disgruntled, the watchdog reset never
triggers, break twice on the console doesn't work, you need to
physically power cycle it.)

I'm mostly curious about whether anyone knows why the Call Trace might
be empty - I see the message about corrupted stack end above it, but
from what I can see online, plenty of people get that message and a
call trace printout below it (...on other architectures, at least).
https://lists.debian.org/debian-sparc/2016/09/msg2.html is even an
example of someone on this very list.

Does anyone have any insights? Or am I going to have to resort to
printks in random parts of the thread the panic notes and hope I find
the problem?

Thanks!
- Rich
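
For background on the panic message: as far as I understand it, when the kernel is built with CONFIG_SCHED_STACK_END_CHECK it keeps a magic word (STACK_END_MAGIC) at the far end of each task stack, and the scheduler panics with "corrupted stack end detected inside scheduler" if that word has changed. The toy model below is a userspace Python sketch of that mechanism only, not kernel code. ToyStack and its methods are hypothetical names, and the layout (canary at index 0, stack growing down toward it) is a simplifying assumption.

```python
STACK_END_MAGIC = 0x57AC6E9D  # value from the kernel's include/uapi/linux/magic.h


class ToyStack:
    """Toy model of a downward-growing stack with a canary word at its end."""

    def __init__(self, size_words: int) -> None:
        self.words = [0] * size_words
        self.words[0] = STACK_END_MAGIC  # lowest address = "stack end"
        self.sp = size_words             # empty stack: sp just past the top

    def push(self, value: int) -> None:
        # Deliberately no canary check here, like a real stack write:
        # pushing one word too deep silently overwrites index 0.
        if self.sp == 0:
            raise IndexError("toy stack exhausted")
        self.sp -= 1
        self.words[self.sp] = value

    def end_corrupted(self) -> bool:
        # Stand-in for the check the scheduler performs later.
        return self.words[0] != STACK_END_MAGIC


if __name__ == "__main__":
    stack = ToyStack(8)
    for i in range(7):          # fills indices 7 down to 1, canary intact
        stack.push(i + 1)
    print(stack.end_corrupted())
    stack.push(0xDEAD)          # eighth push lands on the canary
    print(stack.end_corrupted())
```

In this model the damage is only noticed when end_corrupted() is called, not when the overwrite happens. By analogy, the panic is raised at the next scheduling point after the overflow, so the place where it fires does not by itself identify the code that used too much stack.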