Hi,

On Sun, Aug 11, 2019 at 08:22:35AM +0200, Thomas Gleixner wrote:
> On Sat, 10 Aug 2019, Paul Menzel wrote:
>
> Cc+ Steven
>
> > [+ INTEL IDLE DRIVER]
> >
> > Dear Linux folks,
> >
> >
> > On 10.08.19 20:28, Paul Menzel wrote:
> >
> > > On 10.08.19 19:31, Thomas Gleixner wrote:
> > >
> > > > On Sat, 10 Aug 2019, Paul Menzel wrote:
> > > >
> > > > > I have no idea who to report this to, so please refer me to the
> > > > > correct list.
> > > >
> > > > I have no idea yet either :)
> > > >
> > > > > With Linux 5.2.7 from Debian Sid/unstable and PowerTOP 2.10, executing
> > > > >
> > > > >     sudo powertop --auto-tune
> > > > >
> > > > > causes a NULL pointer dereference, and the graphical session crashes
> > > > > due to an effect on the i915 driver. It worked in the past with the
> > > > > 4.19 series from Debian.
> > > > >
> > > > > Here is the trace, and please find all Linux kernel logs attached.
> > > > >
> > > > > > [ 2027.170589] BUG: kernel NULL pointer dereference, address: 0000000000000000
> > > > > > [ 2027.170600] #PF: supervisor instruction fetch in kernel mode
> > > > > > [ 2027.170604] #PF: error_code(0x0010) - not-present page
> > > > > > [ 2027.170609] PGD 0 P4D 0 [ 2027.170619] Oops: 0010 [#1] SMP PTI
> > > > ...
> > > > > > [ 2027.170730] do_dentry_open+0x13a/0x370
> > > >
> > > > If you have compiled with debug info, please decode the line:
> > > >
> > > >     linux/scripts/faddr2line vmlinux do_dentry_open+0x13a/0x370
> > > >
> > > > That gives us the fops pointer which is NULL.
> > >
> > > Hah, luckily it’s reproducible.
> > >
> > > ```
> > > $ scripts/faddr2line /usr/lib/debug/boot/vmlinux-5.2.0-2-amd64 do_dentry_open+0x13a/0x370
> > > do_dentry_open+0x13a/0x370:
> > > do_dentry_open at fs/open.c:799
> > > ```
> > >
> > > > > > [ 2027.170745] path_openat+0x2c6/0x1480
> > > > > > [ 2027.170757] ? terminate_walk+0xe6/0x100
> > > > > > [ 2027.170767] ? path_lookupat.isra.48+0xa3/0x220
> > > > > > [ 2027.170779] ? reuse_swap_page+0x105/0x320
> > > > > > [ 2027.170791] do_filp_open+0x93/0x100
> > > > > > [ 2027.170804] ? __check_object_size+0x15d/0x189
> > > > > > [ 2027.170816] do_sys_open+0x184/0x220
> > > > > > [ 2027.170828] do_syscall_64+0x53/0x130
> > > > > > [ 2027.170837] entry_SYSCALL_64_after_hwframe+0x44/0xa9
> > > >
> > > > That's an open crashing. We just don't know which file. Is the machine
> > > > completely hosed after that or is it just the graphics stuff dying?
> > >
> > > No, the graphical login manager showed up, and I could log back in, and
> > > continue using the machine.
> > >
> > > > If it's not completely dead then instead of running it from your
> > > > graphical desktop you could switch to a VGA terminal Alt+Ctrl+F1 (or
> > > > whatever function key your distro maps to) after boot and run powertop
> > > > with strace from there:
> > > >
> > > >     strace -f -o xxx.log powertop
> > > >
> > > > With a bit of luck xxx.log should contain the information about the file
> > > > it tries to open.
> > >
> > > ```
> > > 2157 access("/sys/class/drm/card0/power/rc6_residency_ms", R_OK) = 0
> > > 2157 openat(AT_FDCWD, "/sys/kernel/debug/tracing/events/power/cpu_idle/id", O_RDONLY) = ?
> > > 2157 +++ killed by SIGKILL +++
> > > ```
> > >
> > > > Alternatively if you have a serial console you can enable the
> > > > sys_enter_open* tracepoints:
> > > >
> > > >     # echo 1 >/sys/kernel/debug/tracing/events/syscalls/sys_enter_open
> > > >     # echo 1 >/sys/kernel/debug/tracing/events/syscalls/sys_enter_openat
> > > >
> > > > Either add 'ftrace_dump_on_oops' to the kernel command line or enable it
> > > > from the shell:
> > > >
> > > >     # echo 1 > /proc/sys/kernel/ftrace_dump_on_oops
> > > >
> > > > Then run powertop. After the crash it will take some time to spill out
> > > > the trace buffer over serial, but it will pinpoint the offending file.
> > >
> > > I do not have a serial console on this device.
> >
> > For the record. It is also reproducible with Linux 5.2.6, and trying to
> > print the file contents with cat already fails.
> >
> > ```
> > $ sudo ls -l /sys/kernel/debug/tracing/events/power/cpu_idle/id
> > -r--r--r-- 1 root root 0 Aug 10 23:05 /sys/kernel/debug/tracing/events/power/cpu_idle/id
> > $ sudo cat /sys/kernel/debug/tracing/events/power/cpu_idle/id
> > Killed
> > ```

This seems to be related to https://bugs.debian.org/934304 (in
particular https://bugs.debian.org/934304#29). The mentioned patch
features/all/lockdown/0031-tracefs-Restrict-tracefs-when-the-kernel-is-locked-d.patch
is a backport of https://patchwork.kernel.org/patch/11069661/ with the
only change being that it is converted back to the non-LSM lockdown API.

Regards,
Salvatore