Re: [RFC PATCH 1/2] drm: RFC add choice to use dynamic debug in drm-debug

2020-12-17 Thread jim . cromie
On Fri, Dec 11, 2020 at 8:34 AM Ville Syrjälä
 wrote:
>
> On Thu, Dec 03, 2020 at 08:53:17PM -0700, Jim Cromie wrote:
> > drm's debug system uses distinct categories of debug messages, mapped
> > to bits in drm.debug.  Currently, code does a lot of unlikely bit-mask
> > checks on drm.debug (in drm_debug_enabled), we can use dynamic debug
> > instead, and get all that jump_label goodness.

> Is there an actual need to go through dyndbg and do all this stringy
> stuff, or would just eg. a static keys array for the debug categories
> get us the benefits of jump_label?
>

You certainly can strip the car, take the engine.
but you might need some of the drivetrain too.
maybe you want to skip the heated seats ?
dyndbg has some stuff you dont need, for sure.

for one, its heavy on data per callsite, with a static-key and
overhead for each.

But Id be wary that the jump-label code-patching is a slow path,
so trying to change hundreds of jump-sites with one static-key field
may run into problems with long lock hold times, etc.

There is a batching mechanism built-in to the jump-label stuff somewhere,
my impression is that it amortized system-wide syncs while being RT aware.


I've been working on trimming dyndbg down, at least the memory.
I'll be sending it out shortly, but heres a preview:


Subject: [RFC PATCH v2 0/7] dynamic debug diet plan

V2 is a rethought diet plan for dyndbg (I meant -v1 as rfc).

at highest level, patchset does:
1- move struct _ddebug "selector" fields to new struct _ddebug_callsite
2- make ddebug_callsites optional, good for some users
3- allow dropping callsites by those users.

1-v2. Rasmus noted that I shouldn't move format with the other fields,
and I realized that the "module:function:line" dynamic prefixes are
ultimately just log decorations, and are not needed for certain use
cases, including drm (with category -> prefix adaptation).

The drm use case:

- can benefit from jump-labels to avoid drm_debug_enabled()
- can map categories to format-prefixes: "drm:core:" "drm:kms:" etc
- can use dynamic_debug_exec_queries("format ^drm:core: +p", NULL)
- drm + amdgpu have ~3200 drm-debugs, drm + i915 have ~1600

If drm dropped optional site info, net 16 bytes saved / callsite, maybe more...

dropping optional info : module file func means loss of log "decorations"
and slimmer contents of control file.  uncategorized pr-debugs can be
avoided when dropping callsites.   Even with dropped info,
format, line, module queries can select individual sites precisely.

As of now, we still need the __dyndbg_callsites linker section; the
3-drop is just a forget-the-addy, not a kfree.

But compression is possible. v1 tried using zram, with mixed success.
v2 is a better foundation to re-try the zram.


Re: [RFC PATCH 1/2] drm: RFC add choice to use dynamic debug in drm-debug

2020-12-11 Thread Ville Syrjälä
On Thu, Dec 03, 2020 at 08:53:17PM -0700, Jim Cromie wrote:
> drm's debug system uses distinct categories of debug messages, mapped
> to bits in drm.debug.  Currently, code does a lot of unlikely bit-mask
> checks on drm.debug (in drm_debug_enabled), we can use dynamic debug
> instead, and get all that jump_label goodness.
> 
> RFC: dynamic debug has no concept of category, but we can do without
> one if we can prepend a class-prefix to each printk format.  Then we
> can use "format ^prefix" to select the whole category with one query.
> This is a log-facing and user visible change, but it seems unlikely to
> cause trouble for log watchers; they're not relying on the absence of
> class prefix strings.
> 
> This conversion yields ~2100 new callsites on my i7 laptop:
> 
>   dyndbg: 195 debug prints in module drm_kms_helper
>   dyndbg: 298 debug prints in module drm
>   dyndbg: 1630 debug prints in module i915
> 
> Since this change has wide-ranging effects (many drm drivers, with
> many callsites, and kernel image growth), and most vendors don't
> enable DYNAMIC_DEBUG, we supplement the existing mechanism, adding
> CONFIG_DRM_USE_DYNAMIC_DEBUG to enable the new one.
> 
> The indirection/switchover has a few parts:
> 
> 1 a new callback on drm.debug which calls dynamic_debug_exec_queries
>   to map those bits to specific query/commands
>   dynamic_debug_exec_queries("format ^drm:kms: +p", "drm*");
> 
> 2 a "converted" or "classy" DRM_UT_* map
>   similar to DRM_UT_* ( symbol => bit-mask )
>   named it  cDRM_UT_* ( symbol => format-class-prefix-string )
> 
>   cDRM_UT_* is either ( CONFIG_DRM_USE_DYNAMIC_DEBUG )
>   legacy: cDRM_UT_* <-- DRM_UT_*
>   enabled:
>   +#define cDRM_UT_KMS"drm:kms: "
>   +#define cDRM_UT_PRIME  "drm:prime: "
>   +#define cDRM_UT_ATOMIC "drm:atomic: "
> 
>   these are similar to "gvt: cmd:" in i915/gvt
>   and effectively a replacement for DRM_NAME
>   please bikeshed on keys, values. latter are log-facing.
> 
> 3 drm_dev_dbg & drm_debug are renamed (prefixed with '_')
>   old names are now macros, which are ifdefd
>   legacy:  -> to renamed fn
>   enabled: -> dev_dbg & pr_debug, after prepending prefix to format.
> 
> 4 names in (2) are called from DRM_DEBUG_ and drm_dbg_.
>   all these get "converted" to use cDRM_UT_*, to get right token type.
> 
> RFC: for dynamic debug, category is a source-facing addition;
> something like pr_debug_cat(cat, ...) would do it, iff cat is a
> compile-time const.  Note that cat isn't needed in the printing, it
> would be saved into a new field in struct _ddebug, and used only for
> callsite selection, activation and control.
> 
> Signed-off-by: Jim Cromie 
> ---
>  drivers/gpu/drm/Kconfig | 13 ++
>  drivers/gpu/drm/drm_print.c | 75 --
>  include/drm/drm_print.h | 92 +++--
>  3 files changed, 153 insertions(+), 27 deletions(-)
> 
> diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
> index 147d61b9674e..854bc1ad21fb 100644
> --- a/drivers/gpu/drm/Kconfig
> +++ b/drivers/gpu/drm/Kconfig
> @@ -54,6 +54,19 @@ config DRM_DEBUG_MM
>  
> If in doubt, say "N".
>  
> +config DRM_USE_DYNAMIC_DEBUG
> + bool "use dynamic debug to implement drm.debug"
> + default n
> + depends on DRM
> + depends on DYNAMIC_DEBUG
> + depends on JUMP_LABEL
> + help
> +   The drm debug category facility does a lot of unlikely bit-field
> +   tests at runtime; while cheap individually, the cost accumulates.
> +   This option uses dynamic debug facility (if configured and
> +   using jump_label) to avoid those runtime checks, patching
> +   the kernel when those debugs are desired.
> +
>  config DRM_DEBUG_SELFTEST
>   tristate "kselftests for DRM"
>   depends on DRM
> diff --git a/drivers/gpu/drm/drm_print.c b/drivers/gpu/drm/drm_print.c
> index 111b932cf2a9..e2acdfc7088b 100644
> --- a/drivers/gpu/drm/drm_print.c
> +++ b/drivers/gpu/drm/drm_print.c
> @@ -52,8 +52,75 @@ MODULE_PARM_DESC(debug, "Enable debug output, where each 
> bit enables a debug cat
>  "\t\tBit 5 (0x20)  will enable VBL messages (vblank code)\n"
>  "\t\tBit 7 (0x80)  will enable LEASE messages (leasing code)\n"
>  "\t\tBit 8 (0x100) will enable DP messages (displayport code)");
> +
> +#ifndef CONFIG_DRM_USE_DYNAMIC_DEBUG
>  module_param_named(debug, __drm_debug, int, 0600);
>  
> +#else
> +static char *format_class_prefixes[] = {
> + cDRM_UT_CORE,
> + cDRM_UT_DRIVER,
> + cDRM_UT_KMS,
> + cDRM_UT_PRIME,
> + cDRM_UT_ATOMIC,
> + cDRM_UT_VBL,
> + cDRM_UT_STATE,
> + cDRM_UT_LEASE,
> + cDRM_UT_DP,
> + cDRM_UT_DRMRES
> +};
> +
> +#define OUR_QUERY_SIZE 64 /* > strlen "format '^%s' %cp" + longest prefix */
> +
> +static int param_set_dyndbg(const char *instr, const struct kernel_param *kp)
> +{
> + unsigned int val;
> + unsigned long changes, result;
> + int rc, chgct = 0, totct = 0, bitpos;
> + char 

Re: [RFC PATCH 1/2] drm: RFC add choice to use dynamic debug in drm-debug

2020-12-04 Thread jim . cromie
On Fri, Dec 4, 2020 at 8:42 AM Ville Syrjälä
 wrote:
>
> On Thu, Dec 03, 2020 at 08:53:17PM -0700, Jim Cromie wrote:
> > drm's debug system uses distinct categories of debug messages, mapped
> > to bits in drm.debug.  Currently, code does a lot of unlikely bit-mask
> > checks on drm.debug (in drm_debug_enabled), we can use dynamic debug
> > instead, and get all that jump_label goodness.
>
> whatis jump_label?

Sorry, I should have at least capitalized that, and spelled it differently

kernel/Makefile
118:obj-$(CONFIG_JUMP_LABEL) += jump_label.o

it is the hot-patching substrate underneath it all.
static-key, static-call, etc?
dynamic-debug uses static-key directly.



>
> One thing that bugs me about the current drm_dbg() stuff is that
> it's a function, and thus we pay the cost of setting up the
> arguments even when debugs are not enabled. I played around a bit
> with making it a macro again with the unlikely bit check moved
> into the macro. That did seem to make some of the asm a bit nicer
> where the debug stuff got shoved out the main codepath, but
> it did result in a slight net increase in code size. What I didn't
> have time to do is check if this has any measurable speed effect
> on eg. TEST_ONLY commits.
>
> And while doing that I started to ponder if we could use something
> like the alternate() instruction stuff to patch the code at runtime
> in order to turn all those debug checks into nops when debugging
> is not enabled. But I couldn't immediately find any generic
> infrastructure for it. So now I wonder if this jump_label is something
> like that?
>

this is the droid youre looking for ;-)

> >

> --
> Ville Syrjälä
> Intel


Re: [RFC PATCH 1/2] drm: RFC add choice to use dynamic debug in drm-debug

2020-12-04 Thread Ville Syrjälä
On Thu, Dec 03, 2020 at 08:53:17PM -0700, Jim Cromie wrote:
> drm's debug system uses distinct categories of debug messages, mapped
> to bits in drm.debug.  Currently, code does a lot of unlikely bit-mask
> checks on drm.debug (in drm_debug_enabled), we can use dynamic debug
> instead, and get all that jump_label goodness.

whatis jump_label?

One thing that bugs me about the current drm_dbg() stuff is that
it's a function, and thus we pay the cost of setting up the
arguments even when debugs are not enabled. I played around a bit
with making it a macro again with the unlikely bit check moved
into the macro. That did seem to make some of the asm a bit nicer
where the debug stuff got shoved out the main codepath, but
it did result in a slight net increase in code size. What I didn't
have time to do is check if this has any measurable speed effect
on eg. TEST_ONLY commits.

And while doing that I started to ponder if we could use something
like the alternate() instruction stuff to patch the code at runtime
in order to turn all those debug checks into nops when debugging
is not enabled. But I couldn't immediately find any generic
infrastructure for it. So now I wonder if this jump_label is something
like that?

> 
> RFC: dynamic debug has no concept of category, but we can do without
> one if we can prepend a class-prefix to each printk format.  Then we
> can use "format ^prefix" to select the whole category with one query.
> This is a log-facing and user visible change, but it seems unlikely to
> cause trouble for log watchers; they're not relying on the absence of
> class prefix strings.
> 
> This conversion yields ~2100 new callsites on my i7 laptop:
> 
>   dyndbg: 195 debug prints in module drm_kms_helper
>   dyndbg: 298 debug prints in module drm
>   dyndbg: 1630 debug prints in module i915
> 
> Since this change has wide-ranging effects (many drm drivers, with
> many callsites, and kernel image growth), and most vendors don't
> enable DYNAMIC_DEBUG, we supplement the existing mechanism, adding
> CONFIG_DRM_USE_DYNAMIC_DEBUG to enable the new one.
> 
> The indirection/switchover has a few parts:
> 
> 1 a new callback on drm.debug which calls dynamic_debug_exec_queries
>   to map those bits to specific query/commands
>   dynamic_debug_exec_queries("format ^drm:kms: +p", "drm*");
> 
> 2 a "converted" or "classy" DRM_UT_* map
>   similar to DRM_UT_* ( symbol => bit-mask )
>   named it  cDRM_UT_* ( symbol => format-class-prefix-string )
> 
>   cDRM_UT_* is either ( CONFIG_DRM_USE_DYNAMIC_DEBUG )
>   legacy: cDRM_UT_* <-- DRM_UT_*
>   enabled:
>   +#define cDRM_UT_KMS"drm:kms: "
>   +#define cDRM_UT_PRIME  "drm:prime: "
>   +#define cDRM_UT_ATOMIC "drm:atomic: "
> 
>   these are similar to "gvt: cmd:" in i915/gvt
>   and effectively a replacement for DRM_NAME
>   please bikeshed on keys, values. latter are log-facing.
> 
> 3 drm_dev_dbg & drm_debug are renamed (prefixed with '_')
>   old names are now macros, which are ifdefd
>   legacy:  -> to renamed fn
>   enabled: -> dev_dbg & pr_debug, after prepending prefix to format.
> 
> 4 names in (2) are called from DRM_DEBUG_ and drm_dbg_.
>   all these get "converted" to use cDRM_UT_*, to get right token type.
> 
> RFC: for dynamic debug, category is a source-facing addition;
> something like pr_debug_cat(cat, ...) would do it, iff cat is a
> compile-time const.  Note that cat isn't needed in the printing, it
> would be saved into a new field in struct _ddebug, and used only for
> callsite selection, activation and control.
> 
> Signed-off-by: Jim Cromie 
> ---
>  drivers/gpu/drm/Kconfig | 13 ++
>  drivers/gpu/drm/drm_print.c | 75 --
>  include/drm/drm_print.h | 92 +++--
>  3 files changed, 153 insertions(+), 27 deletions(-)
> 
> diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
> index 147d61b9674e..854bc1ad21fb 100644
> --- a/drivers/gpu/drm/Kconfig
> +++ b/drivers/gpu/drm/Kconfig
> @@ -54,6 +54,19 @@ config DRM_DEBUG_MM
>  
> If in doubt, say "N".
>  
> +config DRM_USE_DYNAMIC_DEBUG
> + bool "use dynamic debug to implement drm.debug"
> + default n
> + depends on DRM
> + depends on DYNAMIC_DEBUG
> + depends on JUMP_LABEL
> + help
> +   The drm debug category facility does a lot of unlikely bit-field
> +   tests at runtime; while cheap individually, the cost accumulates.
> +   This option uses dynamic debug facility (if configured and
> +   using jump_label) to avoid those runtime checks, patching
> +   the kernel when those debugs are desired.
> +
>  config DRM_DEBUG_SELFTEST
>   tristate "kselftests for DRM"
>   depends on DRM
> diff --git a/drivers/gpu/drm/drm_print.c b/drivers/gpu/drm/drm_print.c
> index 111b932cf2a9..e2acdfc7088b 100644
> --- a/drivers/gpu/drm/drm_print.c
> +++ b/drivers/gpu/drm/drm_print.c
> @@ -52,8 +52,75 @@ MODULE_PARM_DESC(debug, "Enable debug output, where each 
> bit enables a debug cat

[RFC PATCH 1/2] drm: RFC add choice to use dynamic debug in drm-debug

2020-12-03 Thread Jim Cromie
drm's debug system uses distinct categories of debug messages, mapped
to bits in drm.debug.  Currently, code does a lot of unlikely bit-mask
checks on drm.debug (in drm_debug_enabled), we can use dynamic debug
instead, and get all that jump_label goodness.

RFC: dynamic debug has no concept of category, but we can do without
one if we can prepend a class-prefix to each printk format.  Then we
can use "format ^prefix" to select the whole category with one query.
This is a log-facing and user visible change, but it seems unlikely to
cause trouble for log watchers; they're not relying on the absence of
class prefix strings.

This conversion yields ~2100 new callsites on my i7 laptop:

  dyndbg: 195 debug prints in module drm_kms_helper
  dyndbg: 298 debug prints in module drm
  dyndbg: 1630 debug prints in module i915

Since this change has wide-ranging effects (many drm drivers, with
many callsites, and kernel image growth), and most vendors don't
enable DYNAMIC_DEBUG, we supplement the existing mechanism, adding
CONFIG_DRM_USE_DYNAMIC_DEBUG to enable the new one.

The indirection/switchover has a few parts:

1 a new callback on drm.debug which calls dynamic_debug_exec_queries
  to map those bits to specific query/commands
  dynamic_debug_exec_queries("format ^drm:kms: +p", "drm*");

2 a "converted" or "classy" DRM_UT_* map
  similar to DRM_UT_* ( symbol => bit-mask )
  named it  cDRM_UT_* ( symbol => format-class-prefix-string )

  cDRM_UT_* is either ( CONFIG_DRM_USE_DYNAMIC_DEBUG )
  legacy: cDRM_UT_* <-- DRM_UT_*
  enabled:
  +#define cDRM_UT_KMS"drm:kms: "
  +#define cDRM_UT_PRIME  "drm:prime: "
  +#define cDRM_UT_ATOMIC "drm:atomic: "

  these are similar to "gvt: cmd:" in i915/gvt
  and effectively a replacement for DRM_NAME
  please bikeshed on keys, values. latter are log-facing.

3 drm_dev_dbg & drm_debug are renamed (prefixed with '_')
  old names are now macros, which are ifdefd
  legacy:  -> to renamed fn
  enabled: -> dev_dbg & pr_debug, after prepending prefix to format.

4 names in (2) are called from DRM_DEBUG_ and drm_dbg_.
  all these get "converted" to use cDRM_UT_*, to get right token type.

RFC: for dynamic debug, category is a source-facing addition;
something like pr_debug_cat(cat, ...) would do it, iff cat is a
compile-time const.  Note that cat isn't needed in the printing, it
would be saved into a new field in struct _ddebug, and used only for
callsite selection, activation and control.

Signed-off-by: Jim Cromie 
---
 drivers/gpu/drm/Kconfig | 13 ++
 drivers/gpu/drm/drm_print.c | 75 --
 include/drm/drm_print.h | 92 +++--
 3 files changed, 153 insertions(+), 27 deletions(-)

diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
index 147d61b9674e..854bc1ad21fb 100644
--- a/drivers/gpu/drm/Kconfig
+++ b/drivers/gpu/drm/Kconfig
@@ -54,6 +54,19 @@ config DRM_DEBUG_MM
 
  If in doubt, say "N".
 
+config DRM_USE_DYNAMIC_DEBUG
+   bool "use dynamic debug to implement drm.debug"
+   default n
+   depends on DRM
+   depends on DYNAMIC_DEBUG
+   depends on JUMP_LABEL
+   help
+ The drm debug category facility does a lot of unlikely bit-field
+ tests at runtime; while cheap individually, the cost accumulates.
+ This option uses dynamic debug facility (if configured and
+ using jump_label) to avoid those runtime checks, patching
+ the kernel when those debugs are desired.
+
 config DRM_DEBUG_SELFTEST
tristate "kselftests for DRM"
depends on DRM
diff --git a/drivers/gpu/drm/drm_print.c b/drivers/gpu/drm/drm_print.c
index 111b932cf2a9..e2acdfc7088b 100644
--- a/drivers/gpu/drm/drm_print.c
+++ b/drivers/gpu/drm/drm_print.c
@@ -52,8 +52,75 @@ MODULE_PARM_DESC(debug, "Enable debug output, where each bit 
enables a debug cat
 "\t\tBit 5 (0x20)  will enable VBL messages (vblank code)\n"
 "\t\tBit 7 (0x80)  will enable LEASE messages (leasing code)\n"
 "\t\tBit 8 (0x100) will enable DP messages (displayport code)");
+
+#ifndef CONFIG_DRM_USE_DYNAMIC_DEBUG
 module_param_named(debug, __drm_debug, int, 0600);
 
+#else
+static char *format_class_prefixes[] = {
+   cDRM_UT_CORE,
+   cDRM_UT_DRIVER,
+   cDRM_UT_KMS,
+   cDRM_UT_PRIME,
+   cDRM_UT_ATOMIC,
+   cDRM_UT_VBL,
+   cDRM_UT_STATE,
+   cDRM_UT_LEASE,
+   cDRM_UT_DP,
+   cDRM_UT_DRMRES
+};
+
+#define OUR_QUERY_SIZE 64 /* > strlen "format '^%s' %cp" + longest prefix */
+
+static int param_set_dyndbg(const char *instr, const struct kernel_param *kp)
+{
+   unsigned int val;
+   unsigned long changes, result;
+   int rc, chgct = 0, totct = 0, bitpos;
+   char query[OUR_QUERY_SIZE];
+
+   rc = kstrtouint(instr, 0, );
+   if (rc) {
+   pr_err("%s: failed\n", __func__);
+   return -EINVAL;
+   }
+   result = val;
+   changes = result ^ __drm_debug;
+
+   pr_debug("changes:0x%lx from