Re: [PATCH v4 04/13] dump_stack: Add vmlinux build ID to stack traces

2021-04-13 Thread Stephen Boyd
Quoting Petr Mladek (2021-04-13 07:41:11)
> > diff --git a/lib/dump_stack.c b/lib/dump_stack.c
> > index f5a33b6f773f..d685331b065f 100644
> > --- a/lib/dump_stack.c
> > +++ b/lib/dump_stack.c
> > @@ -5,6 +5,7 @@
> >   */
> >  
> >  #include 
> > +#include 
> >  #include 
> >  #include 
> >  #include 
> > @@ -36,6 +37,14 @@ void __init dump_stack_set_arch_desc(const char *fmt, 
> > ...)
> >   va_end(args);
> >  }
> >  
> > +#if IS_ENABLED(CONFIG_STACKTRACE_BUILD_ID)
> > +#define BUILD_ID_FMT " %20phN"
> > +#define BUILD_ID_VAL vmlinux_build_id
> > +#else
> > +#define BUILD_ID_FMT "%s"
> > +#define BUILD_ID_VAL ""
> > +#endif
> 
> 3rd patch always defines and initializes vmlinux_build_id. But it is
> used only when CONFIG_STACKTRACE_BUILD_ID is enabled.

It is also used for crash code.

> Is it intentional, please?

Yes, mostly for simplicity with the other user.

> 
> It is not a big deal for vmlinux_build_id. But it is more questionable
> for the per-module id. I am going to open this question for 5th patch
> as well.
> 

Right, for the vmlinux_build_id symbol it is not exported, and the whole
buildid.c file is part of lib-y, so if the symbol isn't used the linker
should drop it during link phase. I can drop the early init call if the
config is disabled and crash kernel code isn't enabled, and then rely on
the linker to drop the vmlinux_build_id symbol. Let me see if that can
work so that we don't have to parse it at boot if it is never used.


Re: [PATCH v4 04/13] dump_stack: Add vmlinux build ID to stack traces

2021-04-13 Thread Petr Mladek
On Fri 2021-04-09 18:52:51, Stephen Boyd wrote:
> Add the running kernel's build ID[1] to the stacktrace information
> header.  This makes it simpler for developers to locate the vmlinux with
> full debuginfo for a particular kernel stacktrace. Combined with
> scripts/decode_stracktrace.sh, a developer can download the correct
> vmlinux from a debuginfod[2] server and find the exact file and line
> number for the functions plus offsets in a stacktrace.
> 
> This is especially useful for pstore crash debugging where the kernel
> crashes are recorded in the pstore logs and the recovery kernel is
> different or the debuginfo doesn't exist on the device due to space
> concerns (the data can be large and a security concern). The stacktrace
> can be analyzed after the crash by using the build ID to find the
> matching vmlinux and understand where in the function something went
> wrong.
> 
> Example stacktrace from lkdtm:
> 
>  WARNING: CPU: 4 PID: 3255 at drivers/misc/lkdtm/bugs.c:83 
> lkdtm_WARNING+0x28/0x30 [lkdtm]
>  Modules linked in: lkdtm rfcomm algif_hash algif_skcipher af_alg xt_cgroup 
> uinput xt_MASQUERADE
>  CPU: 4 PID: 3255 Comm: bash Not tainted 5.11 #3 
> aa23f7a1231c229de205662d5a9e0d4c580f19a1
>  Hardware name: Google Lazor (rev3+) with KB Backlight (DT)
>  pstate: 0049 (nzcv daif +PAN -UAO -TCO BTYPE=--)
>  pc : lkdtm_WARNING+0x28/0x30 [lkdtm]
> 
> The hex string aa23f7a1231c229de205662d5a9e0d4c580f19a1 is the build ID,
> following the kernel version number. Put it all behind a config option,
> STACKTRACE_BUILD_ID, so that kernel developers can remove this
> information if they decide it is too much.
> 
> Cc: Jiri Olsa 
> Cc: Alexei Starovoitov 
> Cc: Jessica Yu 
> Cc: Evan Green 
> Cc: Hsin-Yi Wang 
> Cc: Petr Mladek 
> Cc: Steven Rostedt 
> Cc: Andy Shevchenko 
> Cc: Matthew Wilcox 
> Link: https://fedoraproject.org/wiki/Releases/FeatureBuildId [1]
> Link: https://sourceware.org/elfutils/Debuginfod.html [2]
> Signed-off-by: Stephen Boyd 

Reviewed-by: Petr Mladek 
Tested-by: Petr Mladek 

One comment below.

> ---
>  lib/Kconfig.debug | 11 +++
>  lib/dump_stack.c  | 13 +++--
>  2 files changed, 22 insertions(+), 2 deletions(-)
> 
> diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
> index 2779c29d9981..5f883e50f406 100644
> --- a/lib/Kconfig.debug
> +++ b/lib/Kconfig.debug
> @@ -35,6 +35,17 @@ config PRINTK_CALLER
> no option to enable/disable at the kernel command line parameter or
> sysfs interface.
>  
> +config STACKTRACE_BUILD_ID
> + bool "Show build ID information in stacktraces"
> + depends on PRINTK
> + help
> +   Selecting this option adds build ID information for symbols in
> +   stacktraces printed with the printk format '%p[SR]b'.
> +
> +   This option is intended for distros where debuginfo is not easily
> +   accessible but can be downloaded given the build ID of the vmlinux or
> +   kernel module where the function is located.
> +
>  config CONSOLE_LOGLEVEL_DEFAULT
>   int "Default console loglevel (1-15)"
>   range 1 15
> diff --git a/lib/dump_stack.c b/lib/dump_stack.c
> index f5a33b6f773f..d685331b065f 100644
> --- a/lib/dump_stack.c
> +++ b/lib/dump_stack.c
> @@ -5,6 +5,7 @@
>   */
>  
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -36,6 +37,14 @@ void __init dump_stack_set_arch_desc(const char *fmt, ...)
>   va_end(args);
>  }
>  
> +#if IS_ENABLED(CONFIG_STACKTRACE_BUILD_ID)
> +#define BUILD_ID_FMT " %20phN"
> +#define BUILD_ID_VAL vmlinux_build_id
> +#else
> +#define BUILD_ID_FMT "%s"
> +#define BUILD_ID_VAL ""
> +#endif

3rd patch always defines and initializes vmlinux_build_id. But it is
used only when CONFIG_STACKTRACE_BUILD_ID is enabled.
Is it intentional, please?

It is not a big deal for vmlinux_build_id. But it is more questionable
for the per-module id. I am going to open this question for 5th patch
as well.

Best Regards,
Petr


[PATCH v4 04/13] dump_stack: Add vmlinux build ID to stack traces

2021-04-09 Thread Stephen Boyd
Add the running kernel's build ID[1] to the stacktrace information
header.  This makes it simpler for developers to locate the vmlinux with
full debuginfo for a particular kernel stacktrace. Combined with
scripts/decode_stracktrace.sh, a developer can download the correct
vmlinux from a debuginfod[2] server and find the exact file and line
number for the functions plus offsets in a stacktrace.

This is especially useful for pstore crash debugging where the kernel
crashes are recorded in the pstore logs and the recovery kernel is
different or the debuginfo doesn't exist on the device due to space
concerns (the data can be large and a security concern). The stacktrace
can be analyzed after the crash by using the build ID to find the
matching vmlinux and understand where in the function something went
wrong.

Example stacktrace from lkdtm:

 WARNING: CPU: 4 PID: 3255 at drivers/misc/lkdtm/bugs.c:83 
lkdtm_WARNING+0x28/0x30 [lkdtm]
 Modules linked in: lkdtm rfcomm algif_hash algif_skcipher af_alg xt_cgroup 
uinput xt_MASQUERADE
 CPU: 4 PID: 3255 Comm: bash Not tainted 5.11 #3 
aa23f7a1231c229de205662d5a9e0d4c580f19a1
 Hardware name: Google Lazor (rev3+) with KB Backlight (DT)
 pstate: 0049 (nzcv daif +PAN -UAO -TCO BTYPE=--)
 pc : lkdtm_WARNING+0x28/0x30 [lkdtm]

The hex string aa23f7a1231c229de205662d5a9e0d4c580f19a1 is the build ID,
following the kernel version number. Put it all behind a config option,
STACKTRACE_BUILD_ID, so that kernel developers can remove this
information if they decide it is too much.

Cc: Jiri Olsa 
Cc: Alexei Starovoitov 
Cc: Jessica Yu 
Cc: Evan Green 
Cc: Hsin-Yi Wang 
Cc: Petr Mladek 
Cc: Steven Rostedt 
Cc: Andy Shevchenko 
Cc: Matthew Wilcox 
Link: https://fedoraproject.org/wiki/Releases/FeatureBuildId [1]
Link: https://sourceware.org/elfutils/Debuginfod.html [2]
Signed-off-by: Stephen Boyd 
---
 lib/Kconfig.debug | 11 +++
 lib/dump_stack.c  | 13 +++--
 2 files changed, 22 insertions(+), 2 deletions(-)

diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 2779c29d9981..5f883e50f406 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -35,6 +35,17 @@ config PRINTK_CALLER
  no option to enable/disable at the kernel command line parameter or
  sysfs interface.
 
+config STACKTRACE_BUILD_ID
+   bool "Show build ID information in stacktraces"
+   depends on PRINTK
+   help
+ Selecting this option adds build ID information for symbols in
+ stacktraces printed with the printk format '%p[SR]b'.
+
+ This option is intended for distros where debuginfo is not easily
+ accessible but can be downloaded given the build ID of the vmlinux or
+ kernel module where the function is located.
+
 config CONSOLE_LOGLEVEL_DEFAULT
int "Default console loglevel (1-15)"
range 1 15
diff --git a/lib/dump_stack.c b/lib/dump_stack.c
index f5a33b6f773f..d685331b065f 100644
--- a/lib/dump_stack.c
+++ b/lib/dump_stack.c
@@ -5,6 +5,7 @@
  */
 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -36,6 +37,14 @@ void __init dump_stack_set_arch_desc(const char *fmt, ...)
va_end(args);
 }
 
+#if IS_ENABLED(CONFIG_STACKTRACE_BUILD_ID)
+#define BUILD_ID_FMT " %20phN"
+#define BUILD_ID_VAL vmlinux_build_id
+#else
+#define BUILD_ID_FMT "%s"
+#define BUILD_ID_VAL ""
+#endif
+
 /**
  * dump_stack_print_info - print generic debug info for dump_stack()
  * @log_lvl: log level
@@ -45,13 +54,13 @@ void __init dump_stack_set_arch_desc(const char *fmt, ...)
  */
 void dump_stack_print_info(const char *log_lvl)
 {
-   printk("%sCPU: %d PID: %d Comm: %.20s %s%s %s %.*s\n",
+   printk("%sCPU: %d PID: %d Comm: %.20s %s%s %s %.*s" BUILD_ID_FMT "\n",
   log_lvl, raw_smp_processor_id(), current->pid, current->comm,
   kexec_crash_loaded() ? "Kdump: loaded " : "",
   print_tainted(),
   init_utsname()->release,
   (int)strcspn(init_utsname()->version, " "),
-  init_utsname()->version);
+  init_utsname()->version, BUILD_ID_VAL);
 
if (dump_stack_arch_desc_str[0] != '\0')
printk("%sHardware name: %s\n",
-- 
https://chromeos.dev