Re: [PATCH v3 00/19] fix DRM_USE_DYNAMIC_DEBUG regression

2023-02-03 Thread Stanislaw Gruszka
On Fri, Feb 03, 2023 at 11:34:02AM +0200, Jani Nikula wrote:
> On Wed, 25 Jan 2023, Jim Cromie  wrote:
> > Hi everyone,
> >
> > In v6.1 DRM_USE_DYNAMIC_DEBUG=y has a regression enabling drm.debug in
> > drivers at modprobe.
> 
> I realize we haven't actually addressed the regression in any way yet,
> and any distro enabling DYNAMIC_DEBUG || DYNAMIC_DEBUG_CORE will have
> DRM_USE_DYNAMIC_DEBUG=y by default, and we're hitting the issue with
> trying to gather logs from users on v6.1 or later. It hampers debugging
> pretty badly.
> 
> I appreciate the effort in fixing the problem properly here, but we'll
> need a fix that we can backport to stable kernels.
> 
> Maybe just Ville's idea of
> 
>  config DRM_USE_DYNAMIC_DEBUG
> bool "use dynamic debug to implement drm.debug"
> -   default y
> +   default n
> +   depends on BROKEN
> depends on DRM
> depends on DYNAMIC_DEBUG || DYNAMIC_DEBUG_CORE
> 
> but we'll need that as a patch and merged and backported ASAP.

+1 for this

Regards
Stanislaw


Re: [PATCH v3 00/19] fix DRM_USE_DYNAMIC_DEBUG regression

2023-02-03 Thread Jani Nikula
On Wed, 25 Jan 2023, Jim Cromie  wrote:
> Hi everyone,
>
> In v6.1 DRM_USE_DYNAMIC_DEBUG=y has a regression enabling drm.debug in
> drivers at modprobe.

I realize we haven't actually addressed the regression in any way yet,
and any distro enabling DYNAMIC_DEBUG || DYNAMIC_DEBUG_CORE will have
DRM_USE_DYNAMIC_DEBUG=y by default, and we're hitting the issue with
trying to gather logs from users on v6.1 or later. It hampers debugging
pretty badly.

I appreciate the effort in fixing the problem properly here, but we'll
need a fix that we can backport to stable kernels.

Maybe just Ville's idea of

 config DRM_USE_DYNAMIC_DEBUG
bool "use dynamic debug to implement drm.debug"
-   default y
+   default n
+   depends on BROKEN
depends on DRM
depends on DYNAMIC_DEBUG || DYNAMIC_DEBUG_CORE

but we'll need that as a patch and merged and backported ASAP.

In the mean time, is there a workaround that the user could enable, say,
on the kernel command line, to enable drm debugs on driver kernel
modules, all the way from boot?


BR,
Jani.




>
> It is due to a chicken-egg problem loading modules; on `modprobe
> i915`, drm is loaded 1st, and drm/parameters/debug is set.  When
> drm_debug_enabled() tested __drm_debug at runtime, this just worked.
>
> But with DRM_USE_DYNAMIC_DEBUG=y, the runtime test is replaced with a
> static_key for each drm_dbg/dyndbg callsite, enabled by dyndbg's
> kparam callback on __drm_debug.  So with drm.ko loaded and initialized
> before the dependent modules, their debug callsites aren't yet present
> to be enabled.
>
> STATUS - v3
>
> not quite ready.
> rebased on -rc5, hopefully applies to patchwork head 
> still has RFC patch -> CI_ONLY temporary, to avoid panics
> boots on my amdgpu box, drm.debug=0x3ff works at boot-time
> the "toggled" warning is repeatable with test_dynamic_debug*.ko
> it also occurs on amdgpu, so not just artificial.
> v2 is 
> https://lore.kernel.org/lkml/20230113193016.749791-1-jim.cro...@gmail.com/
>
> OVERVIEW
>
> As Jani Nikula noted rather more gently, DECLARE_DYNDBG_CLASSMAP is
> error-prone enough to call broken: sharing of a common classmap
> required identical classmap definitions in all modules using DRM_UT_*,
> which is inherently error-prone.  IOW, it muddled the K distinction
> between a (single) definition, and multiple references.
>
> So patches 10-13 split it into:
>
> DYNDBG_CLASSMAP_DEFINEused once per subsystem to define each classmap.
> DYNDBG_CLASSMAP_USE   declare dependence on a DEFINEd classmap.
>
> DYNDBG_CLASSMAP_DEFINE initializes the classmap, stores it into the
> (existing) __dyndbg_classes section, and exports the struct var
> (unlike DECLARE_DYNDBG_CLASSMAP).
>
> DYNDBG_CLASSMAP_USE initializes a class-ref struct, containing the
> user-module-name, and a ref to the exported classmap var.
>
> The distinction allows separate treatment of classmaps and
> classmap-refs, the latter getting additional behavior to propagate
> parent's kparam settings to USEr. (forex: drm.debug to drm-drivers) 
>
> . lookup the classmap defn being referenced, and its module
> . find the module's kernel-params using the classmap
> . propagate kparam vals into the prdgs in module being added.
>
> It also makes the weird coordinated-changes-by-identical-classmaps
> "feature" unnecessary.
>
> Patch-10 splits the DECLARE macro into DEFINE & USE, and updates uses.
>
> Patch-11 is the core of it; the separate treatment begins in
> ddebug_add_module().  It calls ddebug_attach_module_classes(1) to
> handle class-defns; this adds ddebug_attach_client_module_classes(2)
> to handle class-refs, as they are found while modprobing drm
> drivers. (2) calls ddebug_apply_parents_params(3) on each USEr's
> referred classmap definition.
>
> (3) scans kernel-params owned by the module DEFINEing the classmap,
> either builtin or loadable, calls ddebug_match_apply_kparam(4) on each.
>
> (4) looks for kparams which are wired to dyndbg's param-ops.  Those
> params have a struct ddebug_class_param attached, which has a classmap
> and a ref to a state-var (__drm_debug for DRM case).  If the kparam's
> classmap is the same as from (2), then apply its state-var to the
> client module by calling ddebug_apply_class_bitmap().
>
> Patch-12 cleans up DYNDBG_CLASSMAP_USE, dropping now unneeded args.
>
> Patch-13 improves DYNDBG_CLASSMAP_DEFINE, by accepting DRM_UT_*
> symbols directly, not "DRM_UT_*" (their strings).  It adds new
> include/linux/map.h to support this.
>
> Patches 1-9 are prep, refactor, cleanup, tighten interfaces
>
> Patches 15-18 extend test_dynamic_debug to recreate DRM's multi-module
> regression; it builds both test_dynamic_debug.ko and _submod.ko, with
> an ifdef to _DEFINE in the main module, and _USE in the submod.  This
> gives both modules identical set of prdbgs, which is helpful for
> comparing results.
>
> here it is, working properly:
>
> doing class DRM_UT_CORE -p
> [ 9904.961750] dyndbg: read 21 bytes from userspace
> [ 

[PATCH v3 00/19] fix DRM_USE_DYNAMIC_DEBUG regression

2023-01-25 Thread Jim Cromie
Hi everyone,

In v6.1 DRM_USE_DYNAMIC_DEBUG=y has a regression enabling drm.debug in
drivers at modprobe.

It is due to a chicken-egg problem loading modules; on `modprobe
i915`, drm is loaded 1st, and drm/parameters/debug is set.  When
drm_debug_enabled() tested __drm_debug at runtime, this just worked.

But with DRM_USE_DYNAMIC_DEBUG=y, the runtime test is replaced with a
static_key for each drm_dbg/dyndbg callsite, enabled by dyndbg's
kparam callback on __drm_debug.  So with drm.ko loaded and initialized
before the dependent modules, their debug callsites aren't yet present
to be enabled.

STATUS - v3

not quite ready.
rebased on -rc5, hopefully applies to patchwork head 
still has RFC patch -> CI_ONLY temporary, to avoid panics
boots on my amdgpu box, drm.debug=0x3ff works at boot-time
the "toggled" warning is repeatable with test_dynamic_debug*.ko
it also occurs on amdgpu, so not just artificial.
v2 is https://lore.kernel.org/lkml/20230113193016.749791-1-jim.cro...@gmail.com/

OVERVIEW

As Jani Nikula noted rather more gently, DECLARE_DYNDBG_CLASSMAP is
error-prone enough to call broken: sharing of a common classmap
required identical classmap definitions in all modules using DRM_UT_*,
which is inherently error-prone.  IOW, it muddled the K distinction
between a (single) definition, and multiple references.

So patches 10-13 split it into:

DYNDBG_CLASSMAP_DEFINE  used once per subsystem to define each classmap.
DYNDBG_CLASSMAP_USE declare dependence on a DEFINEd classmap.

DYNDBG_CLASSMAP_DEFINE initializes the classmap, stores it into the
(existing) __dyndbg_classes section, and exports the struct var
(unlike DECLARE_DYNDBG_CLASSMAP).

DYNDBG_CLASSMAP_USE initializes a class-ref struct, containing the
user-module-name, and a ref to the exported classmap var.

The distinction allows separate treatment of classmaps and
classmap-refs, the latter getting additional behavior to propagate
parent's kparam settings to USEr. (forex: drm.debug to drm-drivers) 

. lookup the classmap defn being referenced, and its module
. find the module's kernel-params using the classmap
. propagate kparam vals into the prdgs in module being added.

It also makes the weird coordinated-changes-by-identical-classmaps
"feature" unnecessary.

Patch-10 splits the DECLARE macro into DEFINE & USE, and updates uses.

Patch-11 is the core of it; the separate treatment begins in
ddebug_add_module().  It calls ddebug_attach_module_classes(1) to
handle class-defns; this adds ddebug_attach_client_module_classes(2)
to handle class-refs, as they are found while modprobing drm
drivers. (2) calls ddebug_apply_parents_params(3) on each USEr's
referred classmap definition.

(3) scans kernel-params owned by the module DEFINEing the classmap,
either builtin or loadable, calls ddebug_match_apply_kparam(4) on each.

(4) looks for kparams which are wired to dyndbg's param-ops.  Those
params have a struct ddebug_class_param attached, which has a classmap
and a ref to a state-var (__drm_debug for DRM case).  If the kparam's
classmap is the same as from (2), then apply its state-var to the
client module by calling ddebug_apply_class_bitmap().

Patch-12 cleans up DYNDBG_CLASSMAP_USE, dropping now unneeded args.

Patch-13 improves DYNDBG_CLASSMAP_DEFINE, by accepting DRM_UT_*
symbols directly, not "DRM_UT_*" (their strings).  It adds new
include/linux/map.h to support this.

Patches 1-9 are prep, refactor, cleanup, tighten interfaces

Patches 15-18 extend test_dynamic_debug to recreate DRM's multi-module
regression; it builds both test_dynamic_debug.ko and _submod.ko, with
an ifdef to _DEFINE in the main module, and _USE in the submod.  This
gives both modules identical set of prdbgs, which is helpful for
comparing results.

here it is, working properly:

doing class DRM_UT_CORE -p
[ 9904.961750] dyndbg: read 21 bytes from userspace
[ 9904.962286] dyndbg: query 0: "class DRM_UT_CORE -p" mod:*
[ 9904.962848] dyndbg: split into words: "class" "DRM_UT_CORE" "-p"
[ 9904.963444] dyndbg: op='-' flags=0x0 maskp=0xfffe
[ 9904.963945] dyndbg: parsed: func="" file="" module="" format="" lineno=0-0 
class=DRM_UT_CORE
[ 9904.964781] dyndbg: good-class: drm.DRM_UT_CORE  module:drm nd:302 nc:1 nu:0
[ 9904.966411] dyndbg: class-ref: drm_kms_helper.DRM_UT_CORE  
module:drm_kms_helper nd:95 nc:0 nu:1
[ 9904.967265] dyndbg: class-ref: drm_display_helper.DRM_UT_CORE  
module:drm_display_helper nd:150 nc:0 nu:1
[ 9904.968349] dyndbg: class-ref: i915.DRM_UT_CORE  module:i915 nd:1659 nc:0 
nu:1
[ 9904.969801] dyndbg: class-ref: amdgpu.DRM_UT_CORE  module:amdgpu nd:4425 
nc:0 nu:1
[ 9904.977079] dyndbg: class-ref: nouveau.DRM_UT_CORE  module:nouveau nd:103 
nc:0 nu:1
[ 9904.977830] dyndbg: processed 1 queries, with 507 matches, 0 errs
doing class DRM_UT_DRIVER +p
[ 9906.151761] dyndbg: read 23 bytes from userspace
[ 9906.152241] dyndbg: query 0: "class DRM_UT_DRIVER +p" mod:*
[ 9906.152793] dyndbg: split into words: "class" "DRM_UT_DRIVER" "+p"
[ 9906.153388] dyndbg: