Re: [PATCH 5/7] printk: Make %pS and friends print module build ID
Quoting Rasmus Villemoes (2021-03-04 15:11:47) > On 04/03/2021 20.15, Stephen Boyd wrote: > > Quoting Matthew Wilcox (2021-03-04 09:00:52) > >> On Mon, Mar 01, 2021 at 09:47:47AM -0800, Stephen Boyd wrote: > >>> Example: > >>> > >>> WARNING: CPU: 4 PID: 3255 at drivers/misc/lkdtm/bugs.c:83 > >>> lkdtm_WARNING+0x28/0x30 [lkdtm] (ed5019fdf5e53be37cb1ba7899292d7e143b259e) > >> > >> Would the first 12 characters instead of all 40 make it more palatable > >> without reducing its utility? > > > > I can't seem to request debuginfo from debuginfod without the full 40 > > characters. It's not a git sha1 hash. > > > >> And I feel it should be within the [], so maybe this: > >> > >> WARNING: CPU: 4 PID: 3255 at drivers/misc/lkdtm/bugs.c:83 > >> lkdtm_WARNING+0x28/0x30 [lkdtm ed5019fdf5e5] > >> > > > > Sure I could put the hex numbers inside the brackets. I suspect changing > > %pS or updating the "Modules linked in:" line isn't going to be > > palatable. I've decided to introduce another printk format %pT to print > > the stacktrace > > Can you avoid claiming a new "top-level" %p modifier? Isn't it better to > add a new flag to '%pS', say '%pSb' to include build-id? > I see that %pSR is used in alpha for the stacktrace. I guess we can have %pSb and %pSr then.
Re: [PATCH 5/7] printk: Make %pS and friends print module build ID
On 04/03/2021 20.15, Stephen Boyd wrote: > Quoting Matthew Wilcox (2021-03-04 09:00:52) >> On Mon, Mar 01, 2021 at 09:47:47AM -0800, Stephen Boyd wrote: >>> Example: >>> >>> WARNING: CPU: 4 PID: 3255 at drivers/misc/lkdtm/bugs.c:83 >>> lkdtm_WARNING+0x28/0x30 [lkdtm] (ed5019fdf5e53be37cb1ba7899292d7e143b259e) >> >> Would the first 12 characters instead of all 40 make it more palatable >> without reducing its utility? > > I can't seem to request debuginfo from debuginfod without the full 40 > characters. It's not a git sha1 hash. > >> And I feel it should be within the [], so maybe this: >> >> WARNING: CPU: 4 PID: 3255 at drivers/misc/lkdtm/bugs.c:83 >> lkdtm_WARNING+0x28/0x30 [lkdtm ed5019fdf5e5] >> > > Sure I could put the hex numbers inside the brackets. I suspect changing > %pS or updating the "Modules linked in:" line isn't going to be > palatable. I've decided to introduce another printk format %pT to print > the stacktrace Can you avoid claiming a new "top-level" %p modifier? Isn't it better to add a new flag to '%pS', say '%pSb' to include build-id? Rasmus
Re: [PATCH 5/7] printk: Make %pS and friends print module build ID
Quoting Matthew Wilcox (2021-03-04 09:00:52) > On Mon, Mar 01, 2021 at 09:47:47AM -0800, Stephen Boyd wrote: > > Example: > > > > WARNING: CPU: 4 PID: 3255 at drivers/misc/lkdtm/bugs.c:83 > > lkdtm_WARNING+0x28/0x30 [lkdtm] (ed5019fdf5e53be37cb1ba7899292d7e143b259e) > > Would the first 12 characters instead of all 40 make it more palatable > without reducing its utility? I can't seem to request debuginfo from debuginfod without the full 40 characters. It's not a git sha1 hash. > And I feel it should be within the [], so maybe this: > > WARNING: CPU: 4 PID: 3255 at drivers/misc/lkdtm/bugs.c:83 > lkdtm_WARNING+0x28/0x30 [lkdtm ed5019fdf5e5] > Sure I could put the hex numbers inside the brackets. I suspect changing %pS or updating the "Modules linked in:" line isn't going to be palatable. I've decided to introduce another printk format %pT to print the stacktrace and then updated the architecture code on arm64 and x86 to see how it goes. Other architecures can be updated if this is acceptable. I'll send out a patch series in a little bit that also updates the decode_stacktrace.sh script to parse this.
Re: [PATCH 5/7] printk: Make %pS and friends print module build ID
Quoting Matthew Wilcox (2021-03-04 09:19:40) > On Mon, Mar 01, 2021 at 09:43:19PM -0500, Steven Rostedt wrote: > > On Mon, 1 Mar 2021 09:47:47 -0800 > > Stephen Boyd wrote: > > > > > The %pS printk format (among some others) is used to print kernel > > > addresses symbolically. When the kernel prints an address inside of a > > > module, the kernel prints the addresses' symbol name along with the > > > module's name that contains the address. Let's make kernel stacktraces > > > easier to identify on KALLSYMS builds by including the build ID of a > > > module when we print the address. > > > > Please no! > > > > This kills the output of tracing with offset, and can possibly break > > scripts. I don't want to look at traces like this! > > > > -0 [004] ..s1 353.842577: > > ipv4_conntrack_in+0x0/0x10 [nf_conntrack] > > (3b39eb771b2566331887f671c741f90bfba0b051) <-nf_hook_slow+0x40/0xb0 > > Would it make sense to only print the build-id if it differs from the > build-id of the kernel which has loaded it? No. The build-id of the module is different from the kernel that loaded it all the time, so it would always be printed.
Re: [PATCH 5/7] printk: Make %pS and friends print module build ID
On Mon, Mar 01, 2021 at 09:43:19PM -0500, Steven Rostedt wrote: > On Mon, 1 Mar 2021 09:47:47 -0800 > Stephen Boyd wrote: > > > The %pS printk format (among some others) is used to print kernel > > addresses symbolically. When the kernel prints an address inside of a > > module, the kernel prints the addresses' symbol name along with the > > module's name that contains the address. Let's make kernel stacktraces > > easier to identify on KALLSYMS builds by including the build ID of a > > module when we print the address. > > Please no! > > This kills the output of tracing with offset, and can possibly break > scripts. I don't want to look at traces like this! > > -0 [004] ..s1 353.842577: ipv4_conntrack_in+0x0/0x10 > [nf_conntrack] (3b39eb771b2566331887f671c741f90bfba0b051) > <-nf_hook_slow+0x40/0xb0 Would it make sense to only print the build-id if it differs from the build-id of the kernel which has loaded it?
Re: [PATCH 5/7] printk: Make %pS and friends print module build ID
On Mon, Mar 01, 2021 at 09:47:47AM -0800, Stephen Boyd wrote: > Example: > > WARNING: CPU: 4 PID: 3255 at drivers/misc/lkdtm/bugs.c:83 > lkdtm_WARNING+0x28/0x30 [lkdtm] (ed5019fdf5e53be37cb1ba7899292d7e143b259e) Would the first 12 characters instead of all 40 make it more palatable without reducing its utility? And I feel it should be within the [], so maybe this: WARNING: CPU: 4 PID: 3255 at drivers/misc/lkdtm/bugs.c:83 lkdtm_WARNING+0x28/0x30 [lkdtm ed5019fdf5e5]
Re: [PATCH 5/7] printk: Make %pS and friends print module build ID
Quoting Steven Rostedt (2021-03-03 17:19:32) > On Wed, 03 Mar 2021 16:38:28 -0800 > Stephen Boyd wrote: > > > I'm starting to feel like nobody read the commit text, or I messed up > > somehow and the commit text was confusing? :( > > > > I read it, I'm just unfamiliar with it. I don't use pstore, and I'm not > sure what "crashdump" is. Do you mean the kexec/kdump? in which case > you can retrieve data within the kernel quite easily. Right, I meant kexec/kdump. Given that it is easy to retrieve it in kdump (presumably with some scripting?) I can remove this motivation from the commit text. > > I haven't used debuginfod (never heard of it before actually). Got it. Hopefully the links I provided were good enough? I will provide a link next time. > > > │ This is especially helpful for crash debugging with pstore or crashdump > > > > > > │ kernels. If we have the build ID for the module in the stacktrace we can > > > > > > │ request the debug symbols for the module from a remote debuginfod server > > > > > > │ or parse stacktraces at a later time with decode_stacktrace.sh by > > > > > > │ downloading the correct symbols based on the build ID. This cuts down on > > > > > > │ the amount of time and effort needed to find the correct kernel modules > > > > > > │ for a stacktrace by encoding that information into it. > > Are you saying it's common to have modules from different builds? No. > > > > > In some distro (read: non-kernel dev) workflows the vmlinux isn't > > shipped on the device and crash handling is done offline or much later. > > Using the build ID[1] is a common way to identify the binary that's > > running on the device. In conjunction with a debuginfod[2] server you > > can download the symbols for a crash automatically if you have the build > > ID information. > > > > I can add a patch that updates decode_stacktrace.sh to show how it can > > download the correct vmlinux/modules if it isn't provided on the > > commandline. > > Are you just trying to match modules with the builds that they were > created with? Not exactly. I don't have a mapping of modules to the kernel they're built/used with. I could create a mapping, but then that's something else to maintain vs. what I have right now which is just a big database of debuginfo mapped to build IDs (i.e. a debuginfod server). > > > > > If the debug symbols are on some public server then in theory we could > > have some robot sitting on the mailing list that looks for stacktraces > > and automatically replies with information about the line number/file > > and even provides the code snippet for the code that's crashing from > > that binary, because it's all stored in the full debuginfo builds. > > Again, I have no idea how buildids are created or what they are used > for. This is the first time I've even heard about them. I'm all for > helping other people out to make their workflow easier, if it doesn't > make a mess for everyone else. > > Makes sense and sounds good. Thanks.
Re: [PATCH 5/7] printk: Make %pS and friends print module build ID
On Wed, 03 Mar 2021 16:38:28 -0800 Stephen Boyd wrote: > I'm starting to feel like nobody read the commit text, or I messed up > somehow and the commit text was confusing? :( > I read it, I'm just unfamiliar with it. I don't use pstore, and I'm not sure what "crashdump" is. Do you mean the kexec/kdump? in which case you can retrieve data within the kernel quite easily. I haven't used debuginfod (never heard of it before actually). > │ This is especially helpful for crash debugging with pstore or crashdump > > > │ kernels. If we have the build ID for the module in the stacktrace we can > > > │ request the debug symbols for the module from a remote debuginfod server > > > │ or parse stacktraces at a later time with decode_stacktrace.sh by > > > │ downloading the correct symbols based on the build ID. This cuts down on > > > │ the amount of time and effort needed to find the correct kernel modules > > > │ for a stacktrace by encoding that information into it. Are you saying it's common to have modules from different builds? > > In some distro (read: non-kernel dev) workflows the vmlinux isn't > shipped on the device and crash handling is done offline or much later. > Using the build ID[1] is a common way to identify the binary that's > running on the device. In conjunction with a debuginfod[2] server you > can download the symbols for a crash automatically if you have the build > ID information. > > I can add a patch that updates decode_stacktrace.sh to show how it can > download the correct vmlinux/modules if it isn't provided on the > commandline. Are you just trying to match modules with the builds that they were created with? > > If the debug symbols are on some public server then in theory we could > have some robot sitting on the mailing list that looks for stacktraces > and automatically replies with information about the line number/file > and even provides the code snippet for the code that's crashing from > that binary, because it's all stored in the full debuginfo builds. Again, I have no idea how buildids are created or what they are used for. This is the first time I've even heard about them. I'm all for helping other people out to make their workflow easier, if it doesn't make a mess for everyone else. -- Steve
Re: [PATCH 5/7] printk: Make %pS and friends print module build ID
Quoting Petr Mladek (2021-03-03 02:25:58) > On Mon 2021-03-01 09:47:47, Stephen Boyd wrote: > > The %pS printk format (among some others) is used to print kernel > > addresses symbolically. When the kernel prints an address inside of a > > module, the kernel prints the addresses' symbol name along with the > > module's name that contains the address. Let's make kernel stacktraces > > easier to identify on KALLSYMS builds by including the build ID of a > > module when we print the address. > > > > This is especially helpful for crash debugging with pstore or crashdump > > kernels. If we have the build ID for the module in the stacktrace we can > > request the debug symbols for the module from a remote debuginfod server > > I have read the thread so for. IMHO, all mentioned variants complicate > the logs a lot. Either the backtrace lines are hard to parse. > Or the OOps/panic blocks gets too long when the ID is mentioned > for every loaded module. IMHO, neither proposed solution > is acceptable to be always used. The modules line is already pretty hard to read once it gets beyond 8 or 10 modules. Probably it should be broken up into multiple lines just so it can be read more easily. I find myself having to hunt in there already even without the build ID encoded there. I've also seen folks cut out that line in commit text and when posting to the mailing list because it's just long and noisy already. > > First, I think that only I some developers would actually use > this information. Many of them either know what module was > used or they do not have an easy way to get the debugging > information by the ID. So, it should be configurable > at minimum. It should be configurable because it isn't used or is hard to use? Wouldn't a config variable limit the uptake of this information and further reinforce the fact that nobody will use it? If we always exposed it then maybe we would get new users. I imagine that we could have a robot search crash reports and find the "crashiest" places in the kernel all the way down to the line level and poke kernel developers to look and see why it crashes so much there. If the information is behind a config then the benefit of that analysis will be limited. > > Second, I am not sure that it should be part of each OOps/panic blob. > One solution would be to print the ID by the module loader when > the module gets loaded. It would be mentioned earlier in the log > then. Please no. The kernel log could wrap around before an oops/panic happens and then the ID would be lost. > > Or we could make it available, for example, via /proc. It is a kind > of information that might be gathered later on a rebooted system. > SUSE has "supportconfig" command that allows to gather a lot > of similar information about the system. We use it when > analyzing crashdumps and kernel bugs in general. Please no. The build ID is already available via /sys/module//sections/ and /sys/kernel/notes (for vmlinux) but that won't help for panics that reboot. If a panic happens and a new kernel is booted then post processing on the modules and vmlinux could all be incorrect. Furthermore, the modules will have to be found and parsed by some userspace crash processing tool after the reboot. I'd really prefer to rely on the standard build ID vs. a per-distro bespoke solution to finding the information about the binaries the kernel was running. It's just simpler to not need to find out this sort of information about the system when the kernel knows what binary is running already. This is the same reason coredump_filter exists to let coredumping code figure out the build ID of the process that crashes vs. connecting that to some system information daemon. > > Anyway, I consider this a very detailed information that is not > suitable for everyone. It should be availabe on request, like > for example, backtraces from all CPUs, list of tasks, memory info. I suppose I can make a config option for this module printing stuff in the "Modules linked in:" line. Then we can remove it if most distros decide to enable it? I'm slightly disappointed that it can't just be printed all the time for every stacktrace but if there isn't opposition if it's all behind a config option then I will be happy to get 99% of the code upstream.
Re: [PATCH 5/7] printk: Make %pS and friends print module build ID
Quoting Andy Shevchenko (2021-03-03 08:17:01) > On Wed, Mar 03, 2021 at 10:00:12AM -0500, Steven Rostedt wrote: > > On Wed, 3 Mar 2021 11:25:58 +0100 > > Petr Mladek wrote: > > > > > Alternative solution would be to minimize the information, for > > > example, by printing only the modules that appear in the backtrace. > > > But this might be complicated to implement. > > > > It could be a list after the backtrace perhaps, and not part of the > > "modules linked in"? > > > > But then you need a generic way of capturing those modules in the backtrace > > that works for every architecture. Right, and doing that is sort of complicated for something that really shouldn't need to be complicated. We're printing out information about a crash/hang/bug and that should be fast and not too computationally intensive so that the stacktrace can be printed before everything starts falling apart. I'd rather not save things away while processing the stacktrace and then print more info after the stacktrace. Seems fragile. > > > Honestly, I don't even know what a buildid is, and it is totally useless > > information for myself. What exactly is it used for? > > Dunno Stephen's motivation, but build ID is very useful when you do tracing, > then based on ID the decoders can know what exactly was the layout of the > binary and list of (exported) functions, etc. > > At least that was my (shallow) experience with perf last time I have tried it. > I'm starting to feel like nobody read the commit text, or I messed up somehow and the commit text was confusing? :( │ This is especially helpful for crash debugging with pstore or crashdump │ kernels. If we have the build ID for the module in the stacktrace we can │ request the debug symbols for the module from a remote debuginfod server │ or parse stacktraces at a later time with decode_stacktrace.sh by │ downloading the correct symbols based on the build ID. This cuts down on │ the amount of time and effort needed to find the correct kernel modules │ for a stacktrace by encoding that information into it. In some distro (read: non-kernel dev) workflows the vmlinux isn't shipped on the device and crash handling is done offline or much later. Using the build ID[1] is a common way to identify the binary that's running on the device. In conjunction with a debuginfod[2] server you can download the symbols for a crash automatically if you have the build ID information. I can add a patch that updates decode_stacktrace.sh to show how it can download the correct vmlinux/modules if it isn't provided on the commandline. If the debug symbols are on some public server then in theory we could have some robot sitting on the mailing list that looks for stacktraces and automatically replies with information about the line number/file and even provides the code snippet for the code that's crashing from that binary, because it's all stored in the full debuginfo builds. [1] https://fedoraproject.org/wiki/RolandMcGrath/BuildID [2] https://sourceware.org/elfutils/Debuginfod.html
Re: [PATCH 5/7] printk: Make %pS and friends print module build ID
On Wed, Mar 03, 2021 at 10:00:12AM -0500, Steven Rostedt wrote: > On Wed, 3 Mar 2021 11:25:58 +0100 > Petr Mladek wrote: > > > Alternative solution would be to minimize the information, for > > example, by printing only the modules that appear in the backtrace. > > But this might be complicated to implement. > > It could be a list after the backtrace perhaps, and not part of the > "modules linked in"? > > But then you need a generic way of capturing those modules in the backtrace > that works for every architecture. > Honestly, I don't even know what a buildid is, and it is totally useless > information for myself. What exactly is it used for? Dunno Stephen's motivation, but build ID is very useful when you do tracing, then based on ID the decoders can know what exactly was the layout of the binary and list of (exported) functions, etc. At least that was my (shallow) experience with perf last time I have tried it. -- With Best Regards, Andy Shevchenko
Re: [PATCH 5/7] printk: Make %pS and friends print module build ID
On Wed, 3 Mar 2021 11:25:58 +0100 Petr Mladek wrote: > Alternative solution would be to minimize the information, for > example, by printing only the modules that appear in the backtrace. > But this might be complicated to implement. It could be a list after the backtrace perhaps, and not part of the "modules linked in"? But then you need a generic way of capturing those modules in the backtrace that works for every architecture. Honestly, I don't even know what a buildid is, and it is totally useless information for myself. What exactly is it used for? -- Steve
Re: [PATCH 5/7] printk: Make %pS and friends print module build ID
On Tue, Mar 02, 2021 at 07:00:32PM -0800, Stephen Boyd wrote: > Quoting Steven Rostedt (2021-03-02 18:01:36) > > On Mon, 1 Mar 2021 09:47:47 -0800 > > Stephen Boyd wrote: > > > > > WARNING: CPU: 4 PID: 3255 at drivers/misc/lkdtm/bugs.c:83 > > > lkdtm_WARNING+0x28/0x30 [lkdtm] (ed5019fdf5e53be37cb1ba7899292d7e143b259e) > > > Modules linked in: lkdtm rfcomm algif_hash algif_skcipher af_alg > > > xt_cgroup uinput xt_MASQUERADE > > > CPU: 4 PID: 3255 Comm: bash Not tainted 5.11 #3 > > > aa23f7a1231c229de205662d5a9e0d4c580f19a1 > > > Hardware name: Google Lazor (rev3+) with KB Backlight (DT) > > > pstate: 0049 (nzcv daif +PAN -UAO -TCO BTYPE=--) > > > pc : lkdtm_WARNING+0x28/0x30 [lkdtm] > > > (ed5019fdf5e53be37cb1ba7899292d7e143b259e) > > > lr : lkdtm_do_action+0x24/0x40 [lkdtm] > > > (ed5019fdf5e53be37cb1ba7899292d7e143b259e) > > > > Why not just change the "Modules linked in:" portion to show the build IDs > > instead of every function call? Maybe make it a config option to do so? > > > > As I wrote in the commit text > > │ An alternative to changing the printk format would be to update the > > > │ "Modules linked in:" line to include the build ID of each module linked > > > │ in. This can become quite long when many modules are loaded (i.e. on a > > > │ distro) so I've opted for the printk format instead. > > Implementing it is fairly simple compared to a new printk format. In > fact I did that initially but decided against it because it felt > unreadable and provided more information than was necessary if the > stacktrace wasn't in a module. > > Example: > > Modules linked in: rfcomm 4de3e669b9fee915 algif_hash 915c61c32d476f01 > algif_skcipher 53a4a330149bf071 af_alg b49d66496907f376 xt_cgroup > 36fbbb65e7620df9 uinput a0ff6a06c1c53685 xt_MASQUERADE d130585f728315d2 > snd_soc_sc7180 5c130cd310c81a12 venus_dec 2071e263fde9ca07 > qcom_spmi_temp_alarm 41e28f2a9cc8b562 qcom_spmi_adc_tm5 7e0e48d0b6550c7a > qcom_spmi_adc5 40b81a00bc2d0c1d qcom_vadc_common 09bb012dd1662dea > snd_soc_rt5682_i2c c92b8935ad4a1729 venus_enc efaf3335b88287bf > snd_soc_qcom_common 56d1e87c267ed02e videobuf2_dma_sg e9bc3c9e2758dfc9 > snd_soc_rt5682 c4c9b31bf9a43f8b snd_soc_rl6231 908446e32436898c hci_uart > 107b5fe6884df077 btqca 401a2fcc17b80a39 bluetooth 3369c881496a3cf8 venus_core > a11eee3aa201ad8e ecdh_generic 883e01f98b778108 ecc 226b258cf40ad1ba > v4l2_mem2mem 592f197cbad39e6b snd_soc_lpass_sc7180 42a99cb812d5e2e3 > snd_soc_lpass_hdmi 95cce2160cfc58e2 snd_soc_lpass_cpu 475a4b395577affd > snd_soc_lpass_platform 67517083263035ec snd_soc_max98357a 03ec1af0493d1c59 > fuse 82d170244a4d4ac6 iio_trig_sysfs b879a6228e059834 cros_ec_lid_angle > a713c5f0a06a7a82 cros_ec_sensors 03f0c142ec521f67 cros_ec_sensors_core > ada3f44647980056 cros_ec_sensors_ring f99590e87e998485 > industrialio_triggered_buffer c697969d67f73d77 kfifo_buf e8a3aeb3069188f0 > cros_ec_sensorhub 041ed1ffcefc991b ath10k_snoc 2f60802a74ff6ca7 lzo_rle > 49a6228cec0c6885 ath10k_core 9407c36a2b8f8fae lzo_compress e9b4375d4bad668a > ath 20c585ba6f3998f0 zram c5cdfc1d7d8a35d9 mac80211 9eaed1c76e000c02 cfg80211 > 872178d2dcebb722 cdc_ether 2baa214f72289402 usbnet 2bec73d0922ade28 uvcvideo > eee0352cb1846ee4 r8152 47b76561b78e9b1b mii b306d150923fe614 > videobuf2_vmalloc 55431ec52fa6af84 videobuf2_memops 4d43ad18fa7b1e4e > videobuf2_v4l2 cdda06b9d95ab11d videobuf2_common 69cef0ca55a70a4a joydev > 148399325b72d4a0 > > compared to > > Modules linked in: rfcomm algif_hash algif_skcipher af_alg xt_cgroup uinput > xt_MASQUERADE snd_soc_sc7180 venus_dec qcom_spmi_temp_alarm qcom_spmi_adc_tm5 > qcom_spmi_adc5 qcom_vadc_common snd_soc_rt5682_i2c venus_enc > snd_soc_qcom_common videobuf2_dma_sg snd_soc_rt5682 snd_soc_rl6231 hci_uart > btqca bluetooth venus_core ecdh_generic ecc v4l2_mem2mem snd_soc_lpass_sc7180 > snd_soc_lpass_hdmi snd_soc_lpass_cpu snd_soc_lpass_platform snd_soc_max98357a > fuse iio_trig_sysfs cros_ec_lid_angle cros_ec_sensors cros_ec_sensors_core > cros_ec_sensors_ring industrialio_triggered_buffer kfifo_buf > cros_ec_sensorhub ath10k_snoc lzo_rle ath10k_core lzo_compress ath zram > mac80211 cfg80211 cdc_ether usbnet uvcvideo r8152 mii videobuf2_vmalloc > videobuf2_memops videobuf2_v4l2 videobuf2_common joydev > I suppose it could be surrounded by parenthesis or brackets and that > would be a little better? I'll try this approach again and see what > folks think. But you can do: Modules linked: module1 [buildidX] module2 [buildidY] module3 [buildidZ] ... i.o.w. one module per
Re: [PATCH 5/7] printk: Make %pS and friends print module build ID
On Mon 2021-03-01 09:47:47, Stephen Boyd wrote: > The %pS printk format (among some others) is used to print kernel > addresses symbolically. When the kernel prints an address inside of a > module, the kernel prints the addresses' symbol name along with the > module's name that contains the address. Let's make kernel stacktraces > easier to identify on KALLSYMS builds by including the build ID of a > module when we print the address. > > This is especially helpful for crash debugging with pstore or crashdump > kernels. If we have the build ID for the module in the stacktrace we can > request the debug symbols for the module from a remote debuginfod server I have read the thread so for. IMHO, all mentioned variants complicate the logs a lot. Either the backtrace lines are hard to parse. Or the OOps/panic blocks gets too long when the ID is mentioned for every loaded module. IMHO, neither proposed solution is acceptable to be always used. First, I think that only I some developers would actually use this information. Many of them either know what module was used or they do not have an easy way to get the debugging information by the ID. So, it should be configurable at minimum. Second, I am not sure that it should be part of each OOps/panic blob. One solution would be to print the ID by the module loader when the module gets loaded. It would be mentioned earlier in the log then. Or we could make it available, for example, via /proc. It is a kind of information that might be gathered later on a rebooted system. SUSE has "supportconfig" command that allows to gather a lot of similar information about the system. We use it when analyzing crashdumps and kernel bugs in general. Anyway, I consider this a very detailed information that is not suitable for everyone. It should be availabe on request, like for example, backtraces from all CPUs, list of tasks, memory info. Alternative solution would be to minimize the information, for example, by printing only the modules that appear in the backtrace. But this might be complicated to implement. Best Regards, Petr
Re: [PATCH 5/7] printk: Make %pS and friends print module build ID
Quoting Andy Shevchenko (2021-03-03 00:19:05) > On Tue, Mar 02, 2021 at 07:00:32PM -0800, Stephen Boyd wrote: > > Quoting Steven Rostedt (2021-03-02 18:01:36) > > > On Mon, 1 Mar 2021 09:47:47 -0800 > > > Stephen Boyd wrote: > > > > > > > WARNING: CPU: 4 PID: 3255 at drivers/misc/lkdtm/bugs.c:83 > > > > lkdtm_WARNING+0x28/0x30 [lkdtm] > > > > (ed5019fdf5e53be37cb1ba7899292d7e143b259e) > > > > Modules linked in: lkdtm rfcomm algif_hash algif_skcipher af_alg > > > > xt_cgroup uinput xt_MASQUERADE > > > > CPU: 4 PID: 3255 Comm: bash Not tainted 5.11 #3 > > > > aa23f7a1231c229de205662d5a9e0d4c580f19a1 > > > > Hardware name: Google Lazor (rev3+) with KB Backlight (DT) > > > > pstate: 0049 (nzcv daif +PAN -UAO -TCO BTYPE=--) > > > > pc : lkdtm_WARNING+0x28/0x30 [lkdtm] > > > > (ed5019fdf5e53be37cb1ba7899292d7e143b259e) > > > > lr : lkdtm_do_action+0x24/0x40 [lkdtm] > > > > (ed5019fdf5e53be37cb1ba7899292d7e143b259e) > > > > > > Why not just change the "Modules linked in:" portion to show the build IDs > > > instead of every function call? Maybe make it a config option to do so? > > > > > > > As I wrote in the commit text > > > > │ An alternative to changing the printk format would be to update the > > > > > > │ "Modules linked in:" line to include the build ID of each module linked > > > > > > │ in. This can become quite long when many modules are loaded (i.e. on a > > > > > > │ distro) so I've opted for the printk format instead. > > > > Implementing it is fairly simple compared to a new printk format. In > > fact I did that initially but decided against it because it felt > > unreadable and provided more information than was necessary if the > > stacktrace wasn't in a module. > > > > Example: > > > > Modules linked in: rfcomm 4de3e669b9fee915 algif_hash 915c61c32d476f01 > > algif_skcipher 53a4a330149bf071 af_alg b49d66496907f376 xt_cgroup > > 36fbbb65e7620df9 uinput a0ff6a06c1c53685 xt_MASQUERADE d130585f728315d2 > > snd_soc_sc7180 5c130cd310c81a12 venus_dec 2071e263fde9ca07 > > qcom_spmi_temp_alarm 41e28f2a9cc8b562 qcom_spmi_adc_tm5 7e0e48d0b6550c7a > > qcom_spmi_adc5 40b81a00bc2d0c1d qcom_vadc_common 09bb012dd1662dea > > snd_soc_rt5682_i2c c92b8935ad4a1729 venus_enc efaf3335b88287bf > > snd_soc_qcom_common 56d1e87c267ed02e videobuf2_dma_sg e9bc3c9e2758dfc9 > > snd_soc_rt5682 c4c9b31bf9a43f8b snd_soc_rl6231 908446e32436898c hci_uart > > 107b5fe6884df077 btqca 401a2fcc17b80a39 bluetooth 3369c881496a3cf8 > > venus_core a11eee3aa201ad8e ecdh_generic 883e01f98b778108 ecc > > 226b258cf40ad1ba v4l2_mem2mem 592f197cbad39e6b snd_soc_lpass_sc7180 > > 42a99cb812d5e2e3 snd_soc_lpass_hdmi 95cce2160cfc58e2 snd_soc_lpass_cpu > > 475a4b395577affd snd_soc_lpass_platform 67517083263035ec snd_soc_max98357a > > 03ec1af0493d1c59 > > fuse 82d170244a4d4ac6 iio_trig_sysfs b879a6228e059834 cros_ec_lid_angle > > a713c5f0a06a7a82 cros_ec_sensors 03f0c142ec521f67 cros_ec_sensors_core > > ada3f44647980056 cros_ec_sensors_ring f99590e87e998485 > > industrialio_triggered_buffer c697969d67f73d77 kfifo_buf e8a3aeb3069188f0 > > cros_ec_sensorhub 041ed1ffcefc991b ath10k_snoc 2f60802a74ff6ca7 lzo_rle > > 49a6228cec0c6885 ath10k_core 9407c36a2b8f8fae lzo_compress e9b4375d4bad668a > > ath 20c585ba6f3998f0 zram c5cdfc1d7d8a35d9 mac80211 9eaed1c76e000c02 > > cfg80211 872178d2dcebb722 cdc_ether 2baa214f72289402 usbnet > > 2bec73d0922ade28 uvcvideo eee0352cb1846ee4 r8152 47b76561b78e9b1b mii > > b306d150923fe614 videobuf2_vmalloc 55431ec52fa6af84 videobuf2_memops > > 4d43ad18fa7b1e4e videobuf2_v4l2 cdda06b9d95ab11d videobuf2_common > > 69cef0ca55a70a4a joydev 148399325b72d4a0 > > > > compared to > > > > Modules linked in: rfcomm algif_hash algif_skcipher af_alg xt_cgroup > > uinput xt_MASQUERADE snd_soc_sc7180 venus_dec qcom_spmi_temp_alarm > > qcom_spmi_adc_tm5 qcom_spmi_adc5 qcom_vadc_common snd_soc_rt5682_i2c > > venus_enc snd_soc_qcom_common videobuf2_dma_sg snd_soc_rt5682 > > snd_soc_rl6231 hci_uart btqca bluetooth venus_core ecdh_generic ecc > > v4l2_mem2mem snd_soc_lpass_sc7180 snd_soc_lpass_hdmi snd_soc_lpass_cpu > > snd_soc_lpass_platform snd_soc_max98357a > > fuse iio_trig_sysfs cros_ec_lid_angle cros_ec_sensors > > cros_ec_sensors_core cros_ec_sensors_ring industrialio_triggered_buffer > > kfifo_buf cros_ec_sensorhub ath10k_snoc lzo_rle ath10k_core lzo_compress > > ath zram mac80211 cfg80211 cdc_ether usbnet uvcvideo r8152 mii > > videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_common joydev > > I suppose it could be surrounded by
Re: [PATCH 5/7] printk: Make %pS and friends print module build ID
Quoting Steven Rostedt (2021-03-02 18:01:36) > On Mon, 1 Mar 2021 09:47:47 -0800 > Stephen Boyd wrote: > > > WARNING: CPU: 4 PID: 3255 at drivers/misc/lkdtm/bugs.c:83 > > lkdtm_WARNING+0x28/0x30 [lkdtm] (ed5019fdf5e53be37cb1ba7899292d7e143b259e) > > Modules linked in: lkdtm rfcomm algif_hash algif_skcipher af_alg xt_cgroup > > uinput xt_MASQUERADE > > CPU: 4 PID: 3255 Comm: bash Not tainted 5.11 #3 > > aa23f7a1231c229de205662d5a9e0d4c580f19a1 > > Hardware name: Google Lazor (rev3+) with KB Backlight (DT) > > pstate: 0049 (nzcv daif +PAN -UAO -TCO BTYPE=--) > > pc : lkdtm_WARNING+0x28/0x30 [lkdtm] > > (ed5019fdf5e53be37cb1ba7899292d7e143b259e) > > lr : lkdtm_do_action+0x24/0x40 [lkdtm] > > (ed5019fdf5e53be37cb1ba7899292d7e143b259e) > > Why not just change the "Modules linked in:" portion to show the build IDs > instead of every function call? Maybe make it a config option to do so? > As I wrote in the commit text │ An alternative to changing the printk format would be to update the │ "Modules linked in:" line to include the build ID of each module linked │ in. This can become quite long when many modules are loaded (i.e. on a │ distro) so I've opted for the printk format instead. Implementing it is fairly simple compared to a new printk format. In fact I did that initially but decided against it because it felt unreadable and provided more information than was necessary if the stacktrace wasn't in a module. Example: Modules linked in: rfcomm 4de3e669b9fee915 algif_hash 915c61c32d476f01 algif_skcipher 53a4a330149bf071 af_alg b49d66496907f376 xt_cgroup 36fbbb65e7620df9 uinput a0ff6a06c1c53685 xt_MASQUERADE d130585f728315d2 snd_soc_sc7180 5c130cd310c81a12 venus_dec 2071e263fde9ca07 qcom_spmi_temp_alarm 41e28f2a9cc8b562 qcom_spmi_adc_tm5 7e0e48d0b6550c7a qcom_spmi_adc5 40b81a00bc2d0c1d qcom_vadc_common 09bb012dd1662dea snd_soc_rt5682_i2c c92b8935ad4a1729 venus_enc efaf3335b88287bf snd_soc_qcom_common 56d1e87c267ed02e videobuf2_dma_sg e9bc3c9e2758dfc9 snd_soc_rt5682 c4c9b31bf9a43f8b snd_soc_rl6231 908446e32436898c hci_uart 107b5fe6884df077 btqca 401a2fcc17b80a39 bluetooth 3369c881496a3cf8 venus_core a11eee3aa201ad8e ecdh_generic 883e01f98b778108 ecc 226b258cf40ad1ba v4l2_mem2mem 592f197cbad39e6b snd_soc_lpass_sc7180 42a99cb812d5e2e3 snd_soc_lpass_hdmi 95cce2160cfc58e2 snd_soc_lpass_cpu 475a4b395577affd snd_soc_lpass_platform 67517083263035ec snd_soc_max98357a 03ec1af0493d1c59 fuse 82d170244a4d4ac6 iio_trig_sysfs b879a6228e059834 cros_ec_lid_angle a713c5f0a06a7a82 cros_ec_sensors 03f0c142ec521f67 cros_ec_sensors_core ada3f44647980056 cros_ec_sensors_ring f99590e87e998485 industrialio_triggered_buffer c697969d67f73d77 kfifo_buf e8a3aeb3069188f0 cros_ec_sensorhub 041ed1ffcefc991b ath10k_snoc 2f60802a74ff6ca7 lzo_rle 49a6228cec0c6885 ath10k_core 9407c36a2b8f8fae lzo_compress e9b4375d4bad668a ath 20c585ba6f3998f0 zram c5cdfc1d7d8a35d9 mac80211 9eaed1c76e000c02 cfg80211 872178d2dcebb722 cdc_ether 2baa214f72289402 usbnet 2bec73d0922ade28 uvcvideo eee0352cb1846ee4 r8152 47b76561b78e9b1b mii b306d150923fe614 videobuf2_vmalloc 55431ec52fa6af84 videobuf2_memops 4d43ad18fa7b1e4e videobuf2_v4l2 cdda06b9d95ab11d videobuf2_common 69cef0ca55a70a4a joydev 148399325b72d4a0 compared to Modules linked in: rfcomm algif_hash algif_skcipher af_alg xt_cgroup uinput xt_MASQUERADE snd_soc_sc7180 venus_dec qcom_spmi_temp_alarm qcom_spmi_adc_tm5 qcom_spmi_adc5 qcom_vadc_common snd_soc_rt5682_i2c venus_enc snd_soc_qcom_common videobuf2_dma_sg snd_soc_rt5682 snd_soc_rl6231 hci_uart btqca bluetooth venus_core ecdh_generic ecc v4l2_mem2mem snd_soc_lpass_sc7180 snd_soc_lpass_hdmi snd_soc_lpass_cpu snd_soc_lpass_platform snd_soc_max98357a fuse iio_trig_sysfs cros_ec_lid_angle cros_ec_sensors cros_ec_sensors_core cros_ec_sensors_ring industrialio_triggered_buffer kfifo_buf cros_ec_sensorhub ath10k_snoc lzo_rle ath10k_core lzo_compress ath zram mac80211 cfg80211 cdc_ether usbnet uvcvideo r8152 mii videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_common joydev I suppose it could be surrounded by parenthesis or brackets and that would be a little better? I'll try this approach again and see what folks think.
Re: [PATCH 5/7] printk: Make %pS and friends print module build ID
On Mon, 1 Mar 2021 09:47:47 -0800 Stephen Boyd wrote: > WARNING: CPU: 4 PID: 3255 at drivers/misc/lkdtm/bugs.c:83 > lkdtm_WARNING+0x28/0x30 [lkdtm] (ed5019fdf5e53be37cb1ba7899292d7e143b259e) > Modules linked in: lkdtm rfcomm algif_hash algif_skcipher af_alg xt_cgroup > uinput xt_MASQUERADE > CPU: 4 PID: 3255 Comm: bash Not tainted 5.11 #3 > aa23f7a1231c229de205662d5a9e0d4c580f19a1 > Hardware name: Google Lazor (rev3+) with KB Backlight (DT) > pstate: 0049 (nzcv daif +PAN -UAO -TCO BTYPE=--) > pc : lkdtm_WARNING+0x28/0x30 [lkdtm] > (ed5019fdf5e53be37cb1ba7899292d7e143b259e) > lr : lkdtm_do_action+0x24/0x40 [lkdtm] > (ed5019fdf5e53be37cb1ba7899292d7e143b259e) Why not just change the "Modules linked in:" portion to show the build IDs instead of every function call? Maybe make it a config option to do so? -- Steve
Re: [PATCH 5/7] printk: Make %pS and friends print module build ID
Quoting Steven Rostedt (2021-03-01 18:43:19) > On Mon, 1 Mar 2021 09:47:47 -0800 > Stephen Boyd wrote: > > > The %pS printk format (among some others) is used to print kernel > > addresses symbolically. When the kernel prints an address inside of a > > module, the kernel prints the addresses' symbol name along with the > > module's name that contains the address. Let's make kernel stacktraces > > easier to identify on KALLSYMS builds by including the build ID of a > > module when we print the address. > > Please no! > > This kills the output of tracing with offset, and can possibly break > scripts. I don't want to look at traces like this! > > -0 [004] ..s1 353.842577: ipv4_conntrack_in+0x0/0x10 > [nf_conntrack] (3b39eb771b2566331887f671c741f90bfba0b051) > <-nf_hook_slow+0x40/0xb0 > -0 [004] ..s1 353.842577: nf_conntrack_in+0x0/0x5c0 > [nf_conntrack] (3b39eb771b2566331887f671c741f90bfba0b051) > <-nf_hook_slow+0x40/0xb0 > -0 [004] ..s1 353.842577: get_l4proto+0x0/0x190 > [nf_conntrack] (3b39eb771b2566331887f671c741f90bfba0b051) > <-nf_conntrack_in+0x92/0x5c0 [nf_conntrack] > (3b39eb771b2566331887f671c741f90bfba0b051) > -0 [004] ..s1 353.842577: nf_ct_get_tuple+0x0/0x240 > [nf_conntrack] (3b39eb771b2566331887f671c741f90bfba0b051) > <-nf_conntrack_in+0xec/0x5c0 [nf_conntrack] > (3b39eb771b2566331887f671c741f90bfba0b051) > -0 [004] ..s1 353.842577: > hash_conntrack_raw+0x0/0x170 [nf_conntrack] > (3b39eb771b2566331887f671c741f90bfba0b051) <-nf_conntrack_in+0x28c/0x5c0 > [nf_conntrack] (3b39eb771b2566331887f671c741f90bfba0b051) > -0 [004] ..s1 353.842578: > __nf_conntrack_find_get.isra.0+0x0/0x2f0 [nf_conntrack] > (3b39eb771b2566331887f671c741f90bfba0b051) <-nf_conntrack_in+0x29d/0x5c0 > [nf_conntrack] (3b39eb771b2566331887f671c741f90bfba0b051) > -0 [004] ..s1 353.842578: > nf_conntrack_tcp_packet+0x0/0x1760 [nf_conntrack] > (3b39eb771b2566331887f671c741f90bfba0b051) <-nf_conntrack_in+0x3c8/0x5c0 > [nf_conntrack] (3b39eb771b2566331887f671c741f90bfba0b051) > -0 [004] ..s2 353.842578: nf_ct_seq_offset+0x0/0x40 > [nf_conntrack] (3b39eb771b2566331887f671c741f90bfba0b051) > <-nf_conntrack_tcp_packet+0x26d/0x1760 [nf_conntrack] > (3b39eb771b2566331887f671c741f90bfba0b051) > -0 [004] ..s1 353.842578: > __nf_ct_refresh_acct+0x0/0x50 [nf_conntrack] > (3b39eb771b2566331887f671c741f90bfba0b051) > <-nf_conntrack_tcp_packet+0x558/0x1760 [nf_conntrack] > (3b39eb771b2566331887f671c741f90bfba0b051) > > NACK! > Heh, OK. Would adding a new printk format work then? From after the cut: > Yet another alternative is to add a printk format like %pSB for > backtrace prints. This would require a handful of architecture updates > and I'm not sure it's worth the effort for that. > Or maybe take the approach of putting all the linked in module build IDs in the "Modules linked in:" section? But as I said in the commit text that can become quite verbose. Looking forward to your suggestions here.
Re: [PATCH 5/7] printk: Make %pS and friends print module build ID
On Mon, 1 Mar 2021 09:47:47 -0800 Stephen Boyd wrote: > The %pS printk format (among some others) is used to print kernel > addresses symbolically. When the kernel prints an address inside of a > module, the kernel prints the addresses' symbol name along with the > module's name that contains the address. Let's make kernel stacktraces > easier to identify on KALLSYMS builds by including the build ID of a > module when we print the address. Please no! This kills the output of tracing with offset, and can possibly break scripts. I don't want to look at traces like this! -0 [004] ..s1 353.842577: ipv4_conntrack_in+0x0/0x10 [nf_conntrack] (3b39eb771b2566331887f671c741f90bfba0b051) <-nf_hook_slow+0x40/0xb0 -0 [004] ..s1 353.842577: nf_conntrack_in+0x0/0x5c0 [nf_conntrack] (3b39eb771b2566331887f671c741f90bfba0b051) <-nf_hook_slow+0x40/0xb0 -0 [004] ..s1 353.842577: get_l4proto+0x0/0x190 [nf_conntrack] (3b39eb771b2566331887f671c741f90bfba0b051) <-nf_conntrack_in+0x92/0x5c0 [nf_conntrack] (3b39eb771b2566331887f671c741f90bfba0b051) -0 [004] ..s1 353.842577: nf_ct_get_tuple+0x0/0x240 [nf_conntrack] (3b39eb771b2566331887f671c741f90bfba0b051) <-nf_conntrack_in+0xec/0x5c0 [nf_conntrack] (3b39eb771b2566331887f671c741f90bfba0b051) -0 [004] ..s1 353.842577: hash_conntrack_raw+0x0/0x170 [nf_conntrack] (3b39eb771b2566331887f671c741f90bfba0b051) <-nf_conntrack_in+0x28c/0x5c0 [nf_conntrack] (3b39eb771b2566331887f671c741f90bfba0b051) -0 [004] ..s1 353.842578: __nf_conntrack_find_get.isra.0+0x0/0x2f0 [nf_conntrack] (3b39eb771b2566331887f671c741f90bfba0b051) <-nf_conntrack_in+0x29d/0x5c0 [nf_conntrack] (3b39eb771b2566331887f671c741f90bfba0b051) -0 [004] ..s1 353.842578: nf_conntrack_tcp_packet+0x0/0x1760 [nf_conntrack] (3b39eb771b2566331887f671c741f90bfba0b051) <-nf_conntrack_in+0x3c8/0x5c0 [nf_conntrack] (3b39eb771b2566331887f671c741f90bfba0b051) -0 [004] ..s2 353.842578: nf_ct_seq_offset+0x0/0x40 [nf_conntrack] (3b39eb771b2566331887f671c741f90bfba0b051) <-nf_conntrack_tcp_packet+0x26d/0x1760 [nf_conntrack] (3b39eb771b2566331887f671c741f90bfba0b051) -0 [004] ..s1 353.842578: __nf_ct_refresh_acct+0x0/0x50 [nf_conntrack] (3b39eb771b2566331887f671c741f90bfba0b051) <-nf_conntrack_tcp_packet+0x558/0x1760 [nf_conntrack] (3b39eb771b2566331887f671c741f90bfba0b051) NACK! -- Steve