Re: [PATCH v3 00/19] fix DRM_USE_DYNAMIC_DEBUG regression
On Fri, Feb 03, 2023 at 11:34:02AM +0200, Jani Nikula wrote: > On Wed, 25 Jan 2023, Jim Cromie wrote: > > Hi everyone, > > > > In v6.1 DRM_USE_DYNAMIC_DEBUG=y has a regression enabling drm.debug in > > drivers at modprobe. > > I realize we haven't actually addressed the regression in any way yet, > and any distro enabling DYNAMIC_DEBUG || DYNAMIC_DEBUG_CORE will have > DRM_USE_DYNAMIC_DEBUG=y by default, and we're hitting the issue with > trying to gather logs from users on v6.1 or later. It hampers debugging > pretty badly. > > I appreciate the effort in fixing the problem properly here, but we'll > need a fix that we can backport to stable kernels. > > Maybe just Ville's idea of > > config DRM_USE_DYNAMIC_DEBUG > bool "use dynamic debug to implement drm.debug" > - default y > + default n > + depends on BROKEN > depends on DRM > depends on DYNAMIC_DEBUG || DYNAMIC_DEBUG_CORE > > but we'll need that as a patch and merged and backported ASAP. +1 for this Regards Stanislaw
Re: [PATCH v3 00/19] fix DRM_USE_DYNAMIC_DEBUG regression
On Wed, 25 Jan 2023, Jim Cromie wrote: > Hi everyone, > > In v6.1 DRM_USE_DYNAMIC_DEBUG=y has a regression enabling drm.debug in > drivers at modprobe. I realize we haven't actually addressed the regression in any way yet, and any distro enabling DYNAMIC_DEBUG || DYNAMIC_DEBUG_CORE will have DRM_USE_DYNAMIC_DEBUG=y by default, and we're hitting the issue with trying to gather logs from users on v6.1 or later. It hampers debugging pretty badly. I appreciate the effort in fixing the problem properly here, but we'll need a fix that we can backport to stable kernels. Maybe just Ville's idea of config DRM_USE_DYNAMIC_DEBUG bool "use dynamic debug to implement drm.debug" - default y + default n + depends on BROKEN depends on DRM depends on DYNAMIC_DEBUG || DYNAMIC_DEBUG_CORE but we'll need that as a patch and merged and backported ASAP. In the mean time, is there a workaround that the user could enable, say, on the kernel command line, to enable drm debugs on driver kernel modules, all the way from boot? BR, Jani. > > It is due to a chicken-egg problem loading modules; on `modprobe > i915`, drm is loaded 1st, and drm/parameters/debug is set. When > drm_debug_enabled() tested __drm_debug at runtime, this just worked. > > But with DRM_USE_DYNAMIC_DEBUG=y, the runtime test is replaced with a > static_key for each drm_dbg/dyndbg callsite, enabled by dyndbg's > kparam callback on __drm_debug. So with drm.ko loaded and initialized > before the dependent modules, their debug callsites aren't yet present > to be enabled. > > STATUS - v3 > > not quite ready. > rebased on -rc5, hopefully applies to patchwork head > still has RFC patch -> CI_ONLY temporary, to avoid panics > boots on my amdgpu box, drm.debug=0x3ff works at boot-time > the "toggled" warning is repeatable with test_dynamic_debug*.ko > it also occurs on amdgpu, so not just artificial. > v2 is > https://lore.kernel.org/lkml/20230113193016.749791-1-jim.cro...@gmail.com/ > > OVERVIEW > > As Jani Nikula noted rather more gently, DECLARE_DYNDBG_CLASSMAP is > error-prone enough to call broken: sharing of a common classmap > required identical classmap definitions in all modules using DRM_UT_*, > which is inherently error-prone. IOW, it muddled the K distinction > between a (single) definition, and multiple references. > > So patches 10-13 split it into: > > DYNDBG_CLASSMAP_DEFINEused once per subsystem to define each classmap. > DYNDBG_CLASSMAP_USE declare dependence on a DEFINEd classmap. > > DYNDBG_CLASSMAP_DEFINE initializes the classmap, stores it into the > (existing) __dyndbg_classes section, and exports the struct var > (unlike DECLARE_DYNDBG_CLASSMAP). > > DYNDBG_CLASSMAP_USE initializes a class-ref struct, containing the > user-module-name, and a ref to the exported classmap var. > > The distinction allows separate treatment of classmaps and > classmap-refs, the latter getting additional behavior to propagate > parent's kparam settings to USEr. (forex: drm.debug to drm-drivers) > > . lookup the classmap defn being referenced, and its module > . find the module's kernel-params using the classmap > . propagate kparam vals into the prdgs in module being added. > > It also makes the weird coordinated-changes-by-identical-classmaps > "feature" unnecessary. > > Patch-10 splits the DECLARE macro into DEFINE & USE, and updates uses. > > Patch-11 is the core of it; the separate treatment begins in > ddebug_add_module(). It calls ddebug_attach_module_classes(1) to > handle class-defns; this adds ddebug_attach_client_module_classes(2) > to handle class-refs, as they are found while modprobing drm > drivers. (2) calls ddebug_apply_parents_params(3) on each USEr's > referred classmap definition. > > (3) scans kernel-params owned by the module DEFINEing the classmap, > either builtin or loadable, calls ddebug_match_apply_kparam(4) on each. > > (4) looks for kparams which are wired to dyndbg's param-ops. Those > params have a struct ddebug_class_param attached, which has a classmap > and a ref to a state-var (__drm_debug for DRM case). If the kparam's > classmap is the same as from (2), then apply its state-var to the > client module by calling ddebug_apply_class_bitmap(). > > Patch-12 cleans up DYNDBG_CLASSMAP_USE, dropping now unneeded args. > > Patch-13 improves DYNDBG_CLASSMAP_DEFINE, by accepting DRM_UT_* > symbols directly, not "DRM_UT_*" (their strings). It adds new > include/linux/map.h to support this. > > Patches 1-9 are prep, refactor, cleanup, tighten interfaces > > Patches 15-18 extend test_dynamic_debug to recreate DRM's multi-module > regression; it builds both test_dynamic_debug.ko and _submod.ko, with > an ifdef to _DEFINE in the main module, and _USE in the submod. This > gives both modules identical set of prdbgs, which is helpful for > comparing results. > > here it is, working properly: > > doing class DRM_UT_CORE -p > [ 9904.961750] dyndbg: read 21 bytes from userspace > [
[PATCH v3 00/19] fix DRM_USE_DYNAMIC_DEBUG regression
Hi everyone, In v6.1 DRM_USE_DYNAMIC_DEBUG=y has a regression enabling drm.debug in drivers at modprobe. It is due to a chicken-egg problem loading modules; on `modprobe i915`, drm is loaded 1st, and drm/parameters/debug is set. When drm_debug_enabled() tested __drm_debug at runtime, this just worked. But with DRM_USE_DYNAMIC_DEBUG=y, the runtime test is replaced with a static_key for each drm_dbg/dyndbg callsite, enabled by dyndbg's kparam callback on __drm_debug. So with drm.ko loaded and initialized before the dependent modules, their debug callsites aren't yet present to be enabled. STATUS - v3 not quite ready. rebased on -rc5, hopefully applies to patchwork head still has RFC patch -> CI_ONLY temporary, to avoid panics boots on my amdgpu box, drm.debug=0x3ff works at boot-time the "toggled" warning is repeatable with test_dynamic_debug*.ko it also occurs on amdgpu, so not just artificial. v2 is https://lore.kernel.org/lkml/20230113193016.749791-1-jim.cro...@gmail.com/ OVERVIEW As Jani Nikula noted rather more gently, DECLARE_DYNDBG_CLASSMAP is error-prone enough to call broken: sharing of a common classmap required identical classmap definitions in all modules using DRM_UT_*, which is inherently error-prone. IOW, it muddled the K distinction between a (single) definition, and multiple references. So patches 10-13 split it into: DYNDBG_CLASSMAP_DEFINE used once per subsystem to define each classmap. DYNDBG_CLASSMAP_USE declare dependence on a DEFINEd classmap. DYNDBG_CLASSMAP_DEFINE initializes the classmap, stores it into the (existing) __dyndbg_classes section, and exports the struct var (unlike DECLARE_DYNDBG_CLASSMAP). DYNDBG_CLASSMAP_USE initializes a class-ref struct, containing the user-module-name, and a ref to the exported classmap var. The distinction allows separate treatment of classmaps and classmap-refs, the latter getting additional behavior to propagate parent's kparam settings to USEr. (forex: drm.debug to drm-drivers) . lookup the classmap defn being referenced, and its module . find the module's kernel-params using the classmap . propagate kparam vals into the prdgs in module being added. It also makes the weird coordinated-changes-by-identical-classmaps "feature" unnecessary. Patch-10 splits the DECLARE macro into DEFINE & USE, and updates uses. Patch-11 is the core of it; the separate treatment begins in ddebug_add_module(). It calls ddebug_attach_module_classes(1) to handle class-defns; this adds ddebug_attach_client_module_classes(2) to handle class-refs, as they are found while modprobing drm drivers. (2) calls ddebug_apply_parents_params(3) on each USEr's referred classmap definition. (3) scans kernel-params owned by the module DEFINEing the classmap, either builtin or loadable, calls ddebug_match_apply_kparam(4) on each. (4) looks for kparams which are wired to dyndbg's param-ops. Those params have a struct ddebug_class_param attached, which has a classmap and a ref to a state-var (__drm_debug for DRM case). If the kparam's classmap is the same as from (2), then apply its state-var to the client module by calling ddebug_apply_class_bitmap(). Patch-12 cleans up DYNDBG_CLASSMAP_USE, dropping now unneeded args. Patch-13 improves DYNDBG_CLASSMAP_DEFINE, by accepting DRM_UT_* symbols directly, not "DRM_UT_*" (their strings). It adds new include/linux/map.h to support this. Patches 1-9 are prep, refactor, cleanup, tighten interfaces Patches 15-18 extend test_dynamic_debug to recreate DRM's multi-module regression; it builds both test_dynamic_debug.ko and _submod.ko, with an ifdef to _DEFINE in the main module, and _USE in the submod. This gives both modules identical set of prdbgs, which is helpful for comparing results. here it is, working properly: doing class DRM_UT_CORE -p [ 9904.961750] dyndbg: read 21 bytes from userspace [ 9904.962286] dyndbg: query 0: "class DRM_UT_CORE -p" mod:* [ 9904.962848] dyndbg: split into words: "class" "DRM_UT_CORE" "-p" [ 9904.963444] dyndbg: op='-' flags=0x0 maskp=0xfffe [ 9904.963945] dyndbg: parsed: func="" file="" module="" format="" lineno=0-0 class=DRM_UT_CORE [ 9904.964781] dyndbg: good-class: drm.DRM_UT_CORE module:drm nd:302 nc:1 nu:0 [ 9904.966411] dyndbg: class-ref: drm_kms_helper.DRM_UT_CORE module:drm_kms_helper nd:95 nc:0 nu:1 [ 9904.967265] dyndbg: class-ref: drm_display_helper.DRM_UT_CORE module:drm_display_helper nd:150 nc:0 nu:1 [ 9904.968349] dyndbg: class-ref: i915.DRM_UT_CORE module:i915 nd:1659 nc:0 nu:1 [ 9904.969801] dyndbg: class-ref: amdgpu.DRM_UT_CORE module:amdgpu nd:4425 nc:0 nu:1 [ 9904.977079] dyndbg: class-ref: nouveau.DRM_UT_CORE module:nouveau nd:103 nc:0 nu:1 [ 9904.977830] dyndbg: processed 1 queries, with 507 matches, 0 errs doing class DRM_UT_DRIVER +p [ 9906.151761] dyndbg: read 23 bytes from userspace [ 9906.152241] dyndbg: query 0: "class DRM_UT_DRIVER +p" mod:* [ 9906.152793] dyndbg: split into words: "class" "DRM_UT_DRIVER" "+p" [ 9906.153388] dyndbg: