[PATCH v7 8/9] drm_print: prefer bare printk KERN_DEBUG on generic fn

2022-09-11 Thread Jim Cromie
drm_print.c calls pr_debug() just once, from __drm_printfn_debug(),
which is a generic/service fn.  The callsite is compile-time enabled
by DEBUG in both DYNAMIC_DEBUG=y/n builds.

For dyndbg builds, reverting this callsite back to bare printk is
correcting a few anti-features:

1- callsite is generic, serves multiple drm users.
   it is soft-wired on currently by #define DEBUG
   could accidentally: #> echo -p > /proc/dynamic_debug/control

2- optional "decorations" by dyndbg are unhelpful/misleading here,
   they describe only the generic site, not end users

IOW, 1,2 are unhelpful at best, and possibly confusing.

reverting yields a nominal data and text shrink:

   textdata bss dec hex filename
 462583   36604   54592 553779   87333 /kernel/drivers/gpu/drm/drm.ko
 462515   36532   54592 553639   872a7 -dirty/kernel/drivers/gpu/drm/drm.ko

Signed-off-by: Jim Cromie 
---
 drivers/gpu/drm/drm_print.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/drm_print.c b/drivers/gpu/drm/drm_print.c
index cb203d63b286..ec477c44a784 100644
--- a/drivers/gpu/drm/drm_print.c
+++ b/drivers/gpu/drm/drm_print.c
@@ -23,8 +23,6 @@
  * Rob Clark 
  */
 
-#define DEBUG /* for pr_debug() */
-
 #include 
 
 #include 
@@ -185,7 +183,8 @@ EXPORT_SYMBOL(__drm_printfn_info);
 
 void __drm_printfn_debug(struct drm_printer *p, struct va_format *vaf)
 {
-   pr_debug("%s %pV", p->prefix, vaf);
+   /* pr_debug callsite decorations are unhelpful here */
+   printk(KERN_DEBUG "%s %pV", p->prefix, vaf);
 }
 EXPORT_SYMBOL(__drm_printfn_debug);
 
-- 
2.37.3



[PATCH v7 7/9] drm_print: optimize drm_debug_enabled for jump-label

2022-09-11 Thread Jim Cromie
When CONFIG_DRM_USE_DYNAMIC_DEBUG=y, the drm.debug API (a macro stack,
calling _+drm_*dbg() eventually) invokes a dyndbg Factory macro to
create a descriptor for each callsite, thus making them individually
>control-able.

In this case, the calls to _drm_*dbg are unreachable unless the
callsite is enabled.  So those calls can short-circuit their early
do-nothing returns.  Provide and use __drm_debug_enabled(), to do this
when config'd, or the _raw flags-check otherwize.

And since dyndbg is in use, lets also instrument the remaining users
of drm_debug_enabled, by wrapping the _raw in a macro with a:

  pr_debug("todo: is this frequent enough to optimize ?\n");

For CONFIG_DRM_USE_DYNAMIC_DEBUG=n, do no site instrumenting at all,
since JUMP_LABEL might be off, and we don't want to make work.

With drm, amdgpu, i915, nouveau loaded, heres remaining uses of
drm_debug_enabled(), which costs ~1.5kb data to control the
pr_debug("todo:..")s.

Some of those uses might be ok to use __drm_debug_enabled() by
inspection, others might warrant conversion to use dyndbg Factory
macros, and that would want callrate data to estimate the savings
possible.  TBH, any remaining savings are probably small; drm.debug
covers the vast bulk of the uses.  Maybe "vblank" is the exception.

:#> grep todo /proc/dynamic_debug/control | wc
 21 1682357
:#> grep todo /proc/dynamic_debug/control
drivers/gpu/drm/drm_edid_load.c:178 [drm]edid_load =_ "todo: maybe avoid via 
dyndbg\n"
drivers/gpu/drm/drm_vblank.c:410 [drm]drm_crtc_accurate_vblank_count =_ "todo: 
maybe avoid via dyndbg\n"
drivers/gpu/drm/drm_vblank.c:787 
[drm]drm_crtc_vblank_helper_get_vblank_timestamp_internal =_ "todo: maybe avoid 
via dyndbg\n"
drivers/gpu/drm/drm_vblank.c:1491 [drm]drm_vblank_restore =_ "todo: maybe avoid 
via dyndbg\n"
drivers/gpu/drm/drm_vblank.c:1433 [drm]drm_vblank_enable =_ "todo: maybe avoid 
via dyndbg\n"
drivers/gpu/drm/drm_plane.c:2168 [drm]drm_mode_setplane =_ "todo: maybe avoid 
via dyndbg\n"
drivers/gpu/drm/display/drm_dp_mst_topology.c:1359 
[drm_display_helper]drm_dp_mst_wait_tx_reply =_ "todo: maybe avoid via dyndbg\n"
drivers/gpu/drm/display/drm_dp_mst_topology.c:2864 
[drm_display_helper]process_single_tx_qlock =_ "todo: maybe avoid via dyndbg\n"
drivers/gpu/drm/display/drm_dp_mst_topology.c:2909 
[drm_display_helper]drm_dp_queue_down_tx =_ "todo: maybe avoid via dyndbg\n"
drivers/gpu/drm/display/drm_dp_mst_topology.c:1686 
[drm_display_helper]drm_dp_mst_update_slots =_ "todo: maybe avoid via dyndbg\n"
drivers/gpu/drm/i915/display/intel_dp.c: [i915]intel_dp_print_rates =_ 
"todo: maybe avoid via dyndbg\n"
drivers/gpu/drm/i915/display/intel_backlight.c:5434 [i915]cnp_enable_backlight 
=_ "todo: maybe avoid via dyndbg\n"
drivers/gpu/drm/i915/display/intel_backlight.c:5459 
[i915]intel_backlight_device_register =_ "todo: maybe avoid via dyndbg\n"
drivers/gpu/drm/i915/display/intel_opregion.c:43 
[i915]intel_opregion_notify_encoder =_ "todo: maybe avoid via dyndbg\n"
drivers/gpu/drm/i915/display/intel_opregion.c:53 [i915]asle_set_backlight =_ 
"todo: maybe avoid via dyndbg\n"
drivers/gpu/drm/i915/display/intel_bios.c:1088 [i915]intel_bios_is_dsi_present 
=_ "todo: maybe avoid via dyndbg\n"
drivers/gpu/drm/i915/display/intel_display_debugfs.c:6153 
[i915]i915_drrs_ctl_set =_ "todo: maybe avoid via dyndbg\n"
drivers/gpu/drm/i915/intel_pcode.c:26 [i915]snb_pcode_read =_ "todo: maybe 
avoid via dyndbg\n"
drivers/gpu/drm/i915/i915_getparam.c:785 [i915]i915_getparam_ioctl =_ "todo: 
maybe avoid via dyndbg\n"
drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c:282 [amdgpu]vcn_v2_5_process_interrupt =_ 
"todo: maybe avoid via dyndbg\n"
drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c:433 [amdgpu]vcn_v2_0_process_interrupt =_ 
"todo: maybe avoid via dyndbg\n"
:#>

Signed-off-by: Jim Cromie 
---
- simplify drm-debug-enabled choices, @DanVet
---
 drivers/gpu/drm/drm_print.c |  4 ++--
 include/drm/drm_print.h | 21 -
 2 files changed, 22 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/drm_print.c b/drivers/gpu/drm/drm_print.c
index 29a29949ad0b..cb203d63b286 100644
--- a/drivers/gpu/drm/drm_print.c
+++ b/drivers/gpu/drm/drm_print.c
@@ -285,7 +285,7 @@ void __drm_dev_dbg(const struct device *dev, enum 
drm_debug_category category,
struct va_format vaf;
va_list args;
 
-   if (!drm_debug_enabled(category))
+   if (!__drm_debug_enabled(category))
return;
 
va_start(args, format);
@@ -308,7 +308,7 @@ void ___drm_dbg(enum drm_debug_category category, const 
char *format, ...)
struct va_format vaf;
va_list args;
 
-   if (!drm_debug_enabled(category))
+   if (!__drm_debug_enabled(category))
return;
 
va_start(args, format);
diff --git a/include/drm/drm_print.h b/include/drm/drm_print.h
index dfdd81c3287c..9af57d3df259 100644
--- a/include/drm/drm_print.h
+++ b/include/drm/drm_print.h
@@ -321,11 +321,30 @@ enum drm_debug_category {

[PATCH v7 9/9] drm_print: add _ddebug descriptor to drm_*dbg prototypes

2022-09-11 Thread Jim Cromie
upgrade the callchain to drm_dbg() and drm_dev_dbg(); add a struct
_ddebug ptr parameter to them, and supply that additional param by
replacing the '_no_desc' flavor of dyndbg Factory macro currently used
with the flavor that supplies the descriptor.

NOTES:

The descriptor gives these fns access to the decorator flags, but they
do none of the dynamic-prefixing done by dynamic_emit_prefix(), which
is currently static.

DRM already has conventions for logging/messaging; just tossing
optional decorations on top probably wouldn't help.  Instead, existing
flags (or new ones, perhaps 'sd' ala lspci) can be used to make
current message conventions optional.  This suggests a new
drmdbg_prefix_emit() to handle prefixing locally.

For CONFIG_DRM_USE_DYNAMIC_DEBUG=N, just pass null descriptor.

desc->class_id is redundant with category parameter, but its
availability is dependent on desc.

Signed-off-by: Jim Cromie 
---
 drivers/gpu/drm/drm_print.c |  8 +---
 include/drm/drm_print.h | 23 ---
 2 files changed, 17 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/drm_print.c b/drivers/gpu/drm/drm_print.c
index ec477c44a784..5b93c11895bb 100644
--- a/drivers/gpu/drm/drm_print.c
+++ b/drivers/gpu/drm/drm_print.c
@@ -29,6 +29,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -278,8 +279,8 @@ void drm_dev_printk(const struct device *dev, const char 
*level,
 }
 EXPORT_SYMBOL(drm_dev_printk);
 
-void __drm_dev_dbg(const struct device *dev, enum drm_debug_category category,
-  const char *format, ...)
+void __drm_dev_dbg(struct _ddebug *desc, const struct device *dev,
+  enum drm_debug_category category, const char *format, ...)
 {
struct va_format vaf;
va_list args;
@@ -287,6 +288,7 @@ void __drm_dev_dbg(const struct device *dev, enum 
drm_debug_category category,
if (!__drm_debug_enabled(category))
return;
 
+   /* we know we are printing for either syslog, tracefs, or both */
va_start(args, format);
vaf.fmt = format;
vaf.va = 
@@ -302,7 +304,7 @@ void __drm_dev_dbg(const struct device *dev, enum 
drm_debug_category category,
 }
 EXPORT_SYMBOL(__drm_dev_dbg);
 
-void ___drm_dbg(enum drm_debug_category category, const char *format, ...)
+void ___drm_dbg(struct _ddebug *desc, enum drm_debug_category category, const 
char *format, ...)
 {
struct va_format vaf;
va_list args;
diff --git a/include/drm/drm_print.h b/include/drm/drm_print.h
index 9af57d3df259..a44fb7ef257f 100644
--- a/include/drm/drm_print.h
+++ b/include/drm/drm_print.h
@@ -354,9 +354,10 @@ static inline bool drm_debug_enabled_raw(enum 
drm_debug_category category)
 __printf(3, 4)
 void drm_dev_printk(const struct device *dev, const char *level,
const char *format, ...);
-__printf(3, 4)
-void __drm_dev_dbg(const struct device *dev, enum drm_debug_category category,
-const char *format, ...);
+struct _ddebug;
+__printf(4, 5)
+void __drm_dev_dbg(struct _ddebug *desc, const struct device *dev,
+  enum drm_debug_category category, const char *format, ...);
 
 /**
  * DRM_DEV_ERROR() - Error output.
@@ -406,11 +407,11 @@ void __drm_dev_dbg(const struct device *dev, enum 
drm_debug_category category,
 
 #if !defined(CONFIG_DRM_USE_DYNAMIC_DEBUG)
 #define drm_dev_dbg(dev, cat, fmt, ...)\
-   __drm_dev_dbg(dev, cat, fmt, ##__VA_ARGS__)
+   __drm_dev_dbg(NULL, dev, cat, fmt, ##__VA_ARGS__)
 #else
 #define drm_dev_dbg(dev, cat, fmt, ...)\
-   _dynamic_func_call_no_desc(fmt, __drm_dev_dbg,  \
-  dev, cat, fmt, ##__VA_ARGS__)
+   _dynamic_func_call_cls(cat, fmt, __drm_dev_dbg, \
+  dev, cat, fmt, ##__VA_ARGS__)
 #endif
 
 /**
@@ -514,17 +515,17 @@ void __drm_dev_dbg(const struct device *dev, enum 
drm_debug_category category,
  * Prefer drm_device based logging over device or prink based logging.
  */
 
-__printf(2, 3)
-void ___drm_dbg(enum drm_debug_category category, const char *format, ...);
+__printf(3, 4)
+void ___drm_dbg(struct _ddebug *desc, enum drm_debug_category category, const 
char *format, ...);
 __printf(1, 2)
 void __drm_err(const char *format, ...);
 
 #if !defined(CONFIG_DRM_USE_DYNAMIC_DEBUG)
-#define __drm_dbg(fmt, ...)___drm_dbg(fmt, ##__VA_ARGS__)
+#define __drm_dbg(fmt, ...)___drm_dbg(NULL, fmt, ##__VA_ARGS__)
 #else
 #define __drm_dbg(cat, fmt, ...)   \
-   _dynamic_func_call_no_desc(fmt, ___drm_dbg, \
-  cat, fmt, ##__VA_ARGS__)
+   _dynamic_func_call_cls(cat, fmt, ___drm_dbg,\
+  cat, fmt, ##__VA_ARGS__)
 #endif
 
 /* Macros to make printk easier */
-- 
2.37.3



[PATCH v7 6/9] drm-print: add drm_dbg_driver to improve namespace symmetry

2022-09-11 Thread Jim Cromie
drm_print defines all of these:
drm_dbg_{core,kms,prime,atomic,vbl,lease,_dp,_drmres}

but not drm_dbg_driver itself, since it was the original drm_dbg.

To improve namespace symmetry, change the drm_dbg defn to
drm_dbg_driver, and redef grandfathered name to symmetric one.

This will help with nouveau, which uses its own stack of macros to
construct calls to dev_info, dev_dbg, etc, for which adaptation means
drm_dbg_##driver constructs.

Signed-off-by: Jim Cromie 
---
 include/drm/drm_print.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/include/drm/drm_print.h b/include/drm/drm_print.h
index f8bb3e7158c6..dfdd81c3287c 100644
--- a/include/drm/drm_print.h
+++ b/include/drm/drm_print.h
@@ -468,7 +468,7 @@ void __drm_dev_dbg(const struct device *dev, enum 
drm_debug_category category,
 
 #define drm_dbg_core(drm, fmt, ...)\
drm_dev_dbg((drm) ? (drm)->dev : NULL, DRM_UT_CORE, fmt, ##__VA_ARGS__)
-#define drm_dbg(drm, fmt, ...) \
+#define drm_dbg_driver(drm, fmt, ...)  
\
drm_dev_dbg((drm) ? (drm)->dev : NULL, DRM_UT_DRIVER, fmt, 
##__VA_ARGS__)
 #define drm_dbg_kms(drm, fmt, ...) \
drm_dev_dbg((drm) ? (drm)->dev : NULL, DRM_UT_KMS, fmt, ##__VA_ARGS__)
@@ -487,6 +487,7 @@ void __drm_dev_dbg(const struct device *dev, enum 
drm_debug_category category,
 #define drm_dbg_drmres(drm, fmt, ...)  \
drm_dev_dbg((drm) ? (drm)->dev : NULL, DRM_UT_DRMRES, fmt, 
##__VA_ARGS__)
 
+#define drm_dbg(drm, fmt, ...) drm_dbg_driver(drm, fmt, ##__VA_ARGS__)
 
 /*
  * printk based logging
-- 
2.37.3



[PATCH v7 1/9] drm_print: condense enum drm_debug_category

2022-09-11 Thread Jim Cromie
enum drm_debug_category has 10 categories, but is initialized with
bitmasks which require 10 bits of underlying storage.  By using
natural enumeration, and moving the BIT(cat) into drm_debug_enabled(),
the enum fits in 4 bits, allowing the category to be represented
directly in pr_debug callsites, via the ddebug.class_id field.

While this slightly pessimizes the bit-test in drm_debug_enabled(),
using dyndbg with JUMP_LABEL will avoid the function entirely.

NOTE: this change forecloses the possibility of doing:

  drm_dbg(DRM_UT_CORE|DRM_UT_KMS, "weird 2-cat experiment")

but thats already strongly implied by the use of the enum itself; its
not a normal enum if it can be 2 values simultaneously.

Signed-off-by: Jim Cromie 
---
 include/drm/drm_print.h | 22 +++---
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/include/drm/drm_print.h b/include/drm/drm_print.h
index 22fabdeed297..b3b470440e46 100644
--- a/include/drm/drm_print.h
+++ b/include/drm/drm_print.h
@@ -279,49 +279,49 @@ enum drm_debug_category {
 * @DRM_UT_CORE: Used in the generic drm code: drm_ioctl.c, drm_mm.c,
 * drm_memory.c, ...
 */
-   DRM_UT_CORE = 0x01,
+   DRM_UT_CORE,
/**
 * @DRM_UT_DRIVER: Used in the vendor specific part of the driver: i915,
 * radeon, ... macro.
 */
-   DRM_UT_DRIVER   = 0x02,
+   DRM_UT_DRIVER,
/**
 * @DRM_UT_KMS: Used in the modesetting code.
 */
-   DRM_UT_KMS  = 0x04,
+   DRM_UT_KMS,
/**
 * @DRM_UT_PRIME: Used in the prime code.
 */
-   DRM_UT_PRIME= 0x08,
+   DRM_UT_PRIME,
/**
 * @DRM_UT_ATOMIC: Used in the atomic code.
 */
-   DRM_UT_ATOMIC   = 0x10,
+   DRM_UT_ATOMIC,
/**
 * @DRM_UT_VBL: Used for verbose debug message in the vblank code.
 */
-   DRM_UT_VBL  = 0x20,
+   DRM_UT_VBL,
/**
 * @DRM_UT_STATE: Used for verbose atomic state debugging.
 */
-   DRM_UT_STATE= 0x40,
+   DRM_UT_STATE,
/**
 * @DRM_UT_LEASE: Used in the lease code.
 */
-   DRM_UT_LEASE= 0x80,
+   DRM_UT_LEASE,
/**
 * @DRM_UT_DP: Used in the DP code.
 */
-   DRM_UT_DP   = 0x100,
+   DRM_UT_DP,
/**
 * @DRM_UT_DRMRES: Used in the drm managed resources code.
 */
-   DRM_UT_DRMRES   = 0x200,
+   DRM_UT_DRMRES
 };
 
 static inline bool drm_debug_enabled(enum drm_debug_category category)
 {
-   return unlikely(__drm_debug & category);
+   return unlikely(__drm_debug & BIT(category));
 }
 
 /*
-- 
2.37.3



[PATCH v7 5/9] drm-print.h: include dyndbg header

2022-09-11 Thread Jim Cromie
lkp robot told me:

  >> drivers/gpu/drm/drm_ioc32.c:989:2:
  error: call to undeclared function '_dynamic_func_call_cls';
  ISO C99 and later do not support implicit function declarations
  [-Wimplicit-function-declaration]

   DRM_DEBUG("comm=\"%s\", pid=%d, dev=0x%lx, auth=%d, %s\n",

Since that macro is defined in drm_print.h, and under DRM_USE_DYN*=y
configs, invokes dyndbg-factory macros, include dynamic_debug.h from
there too, so that those configs have the definitions of all the
macros in the callchain.

This is done as a separate patch mostly to see how lkp sorts it.

Signed-off-by: Jim Cromie 
---
 include/drm/drm_print.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/drm/drm_print.h b/include/drm/drm_print.h
index 2d2cef76b5c1..f8bb3e7158c6 100644
--- a/include/drm/drm_print.h
+++ b/include/drm/drm_print.h
@@ -31,6 +31,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 
-- 
2.37.3



[PATCH v7 4/9] drm_print: wrap drm_*_dbg in dyndbg descriptor factory macro

2022-09-11 Thread Jim Cromie
For CONFIG_DRM_USE_DYNAMIC_DEBUG=y, wrap __drm_dbg() & __drm_dev_dbg()
in one of dyndbg's Factory macros: _dynamic_func_call_no_desc().

This adds the callsite descriptor into the code, and an entry for each
into /proc/dynamic_debug/control.

  #> echo class DRM_UT_ATOMIC +p > /proc/dynamic_debug/control

CONFIG_DRM_USE_DYNAMIC_DEBUG=y/n is configurable because of the .data
footprint cost of per-callsite control; 56 bytes/site * ~2k for i915,
~4k callsites for amdgpu.  This is large enough that a kernel builder
might not want it.

Signed-off-by: Jim Cromie 
---
 drivers/gpu/drm/Kconfig  | 12 
 drivers/gpu/drm/Makefile |  2 ++
 include/drm/drm_print.h  | 12 
 3 files changed, 26 insertions(+)

diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
index 6c2256e8474b..2438e0dccfa1 100644
--- a/drivers/gpu/drm/Kconfig
+++ b/drivers/gpu/drm/Kconfig
@@ -50,6 +50,18 @@ config DRM_DEBUG_MM
 
  If in doubt, say "N".
 
+config DRM_USE_DYNAMIC_DEBUG
+   bool "use dynamic debug to implement drm.debug"
+   default y
+   depends on DRM
+   depends on DYNAMIC_DEBUG || DYNAMIC_DEBUG_CORE
+   depends on JUMP_LABEL
+   help
+ Use dynamic-debug to avoid drm_debug_enabled() runtime overheads.
+ Due to callsite counts in DRM drivers (~4k in amdgpu) and 56
+ bytes per callsite, the .data costs can be substantial, and
+ are therefore configurable.
+
 config DRM_DEBUG_SELFTEST
tristate "kselftests for DRM"
depends on DRM
diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile
index e7af358e6dda..6828197967a6 100644
--- a/drivers/gpu/drm/Makefile
+++ b/drivers/gpu/drm/Makefile
@@ -3,6 +3,8 @@
 # Makefile for the drm device driver.  This driver provides support for the
 # Direct Rendering Infrastructure (DRI) in XFree86 4.1.0 and higher.
 
+CFLAGS-$(CONFIG_DRM_USE_DYNAMIC_DEBUG) += -DDYNAMIC_DEBUG_MODULE
+
 drm-y   := drm_aperture.o drm_auth.o drm_cache.o \
drm_file.o drm_gem.o drm_ioctl.o \
drm_drv.o \
diff --git a/include/drm/drm_print.h b/include/drm/drm_print.h
index c429c258c957..2d2cef76b5c1 100644
--- a/include/drm/drm_print.h
+++ b/include/drm/drm_print.h
@@ -384,8 +384,14 @@ void __drm_dev_dbg(const struct device *dev, enum 
drm_debug_category category,
}   \
 })
 
+#if !defined(CONFIG_DRM_USE_DYNAMIC_DEBUG)
 #define drm_dev_dbg(dev, cat, fmt, ...)\
__drm_dev_dbg(dev, cat, fmt, ##__VA_ARGS__)
+#else
+#define drm_dev_dbg(dev, cat, fmt, ...)\
+   _dynamic_func_call_no_desc(fmt, __drm_dev_dbg,  \
+  dev, cat, fmt, ##__VA_ARGS__)
+#endif
 
 /**
  * DRM_DEV_DEBUG() - Debug output for generic drm code
@@ -492,7 +498,13 @@ void ___drm_dbg(enum drm_debug_category category, const 
char *format, ...);
 __printf(1, 2)
 void __drm_err(const char *format, ...);
 
+#if !defined(CONFIG_DRM_USE_DYNAMIC_DEBUG)
 #define __drm_dbg(fmt, ...)___drm_dbg(fmt, ##__VA_ARGS__)
+#else
+#define __drm_dbg(cat, fmt, ...)   \
+   _dynamic_func_call_no_desc(fmt, ___drm_dbg, \
+  cat, fmt, ##__VA_ARGS__)
+#endif
 
 /* Macros to make printk easier */
 
-- 
2.37.3



[PATCH v7 3/9] drm_print: interpose drm_*dbg with forwarding macros

2022-09-11 Thread Jim Cromie
change drm_dev_dbg & drm_dbg to macros, which forward to the renamed
functions (with __ prefix added).

Those functions sit below the categorized layer of macros implementing
the DRM debug.category API, and implement most of it.  These are good
places to insert dynamic-debug jump-label mechanics, which will allow
DRM to avoid the runtime cost of drm_debug_enabled().

no functional changes.

memory cost baseline: (unchanged)
bash-5.1# drms_load
[9.220389] dyndbg:   1 debug prints in module drm
[9.224426] ACPI: bus type drm_connector registered
[9.302192] dyndbg:   2 debug prints in module ttm
[9.305033] dyndbg:   8 debug prints in module video
[9.627563] dyndbg: 127 debug prints in module i915
[9.721505] AMD-Vi: AMD IOMMUv2 functionality not available on this system - 
This is not a bug.
[   10.091345] dyndbg: 2196 debug prints in module amdgpu
[   10.106589] [drm] amdgpu kernel modesetting enabled.
[   10.107270] amdgpu: CRAT table not found
[   10.107926] amdgpu: Virtual CRAT table created for CPU
[   10.108398] amdgpu: Topology: Add CPU node
[   10.168507] dyndbg:   3 debug prints in module wmi
[   10.329587] dyndbg:   3 debug prints in module nouveau

Signed-off-by: Jim Cromie 
---
 drivers/gpu/drm/drm_print.c | 10 +-
 include/drm/drm_print.h |  9 +++--
 2 files changed, 12 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/drm_print.c b/drivers/gpu/drm/drm_print.c
index ec32df35a3e3..29a29949ad0b 100644
--- a/drivers/gpu/drm/drm_print.c
+++ b/drivers/gpu/drm/drm_print.c
@@ -279,8 +279,8 @@ void drm_dev_printk(const struct device *dev, const char 
*level,
 }
 EXPORT_SYMBOL(drm_dev_printk);
 
-void drm_dev_dbg(const struct device *dev, enum drm_debug_category category,
-const char *format, ...)
+void __drm_dev_dbg(const struct device *dev, enum drm_debug_category category,
+  const char *format, ...)
 {
struct va_format vaf;
va_list args;
@@ -301,9 +301,9 @@ void drm_dev_dbg(const struct device *dev, enum 
drm_debug_category category,
 
va_end(args);
 }
-EXPORT_SYMBOL(drm_dev_dbg);
+EXPORT_SYMBOL(__drm_dev_dbg);
 
-void __drm_dbg(enum drm_debug_category category, const char *format, ...)
+void ___drm_dbg(enum drm_debug_category category, const char *format, ...)
 {
struct va_format vaf;
va_list args;
@@ -320,7 +320,7 @@ void __drm_dbg(enum drm_debug_category category, const char 
*format, ...)
 
va_end(args);
 }
-EXPORT_SYMBOL(__drm_dbg);
+EXPORT_SYMBOL(___drm_dbg);
 
 void __drm_err(const char *format, ...)
 {
diff --git a/include/drm/drm_print.h b/include/drm/drm_print.h
index 668273e36c2c..c429c258c957 100644
--- a/include/drm/drm_print.h
+++ b/include/drm/drm_print.h
@@ -335,7 +335,7 @@ __printf(3, 4)
 void drm_dev_printk(const struct device *dev, const char *level,
const char *format, ...);
 __printf(3, 4)
-void drm_dev_dbg(const struct device *dev, enum drm_debug_category category,
+void __drm_dev_dbg(const struct device *dev, enum drm_debug_category category,
 const char *format, ...);
 
 /**
@@ -384,6 +384,9 @@ void drm_dev_dbg(const struct device *dev, enum 
drm_debug_category category,
}   \
 })
 
+#define drm_dev_dbg(dev, cat, fmt, ...)\
+   __drm_dev_dbg(dev, cat, fmt, ##__VA_ARGS__)
+
 /**
  * DRM_DEV_DEBUG() - Debug output for generic drm code
  *
@@ -485,10 +488,12 @@ void drm_dev_dbg(const struct device *dev, enum 
drm_debug_category category,
  */
 
 __printf(2, 3)
-void __drm_dbg(enum drm_debug_category category, const char *format, ...);
+void ___drm_dbg(enum drm_debug_category category, const char *format, ...);
 __printf(1, 2)
 void __drm_err(const char *format, ...);
 
+#define __drm_dbg(fmt, ...)___drm_dbg(fmt, ##__VA_ARGS__)
+
 /* Macros to make printk easier */
 
 #define _DRM_PRINTK(once, level, fmt, ...) \
-- 
2.37.3



[PATCH v7 2/9] drm: POC drm on dyndbg - use in core, 2 helpers, 3 drivers.

2022-09-11 Thread Jim Cromie
Use DECLARE_DYNDBG_CLASSMAP across DRM:

 - in .c files, since macro defines/initializes a record

 - in drivers, $mod_{drv,drm,param}.c
   ie where param setup is done, since a classmap is param related

 - in drm/drm_print.c
   since existing __drm_debug param is defined there,
   and we ifdef it, and provide an elaborated alternative.

 - in drm_*_helper modules:
   dp/drm_dp - 1st item in makefile target
   drivers/gpu/drm/drm_crtc_helper.c - random pick iirc.

Since these modules all use identical CLASSMAP declarations (ie: names
and .class_id's) they will all respond together to "class DRM_UT_*"
query-commands:

  :#> echo class DRM_UT_KMS +p > /proc/dynamic_debug/control

NOTES:

This changes __drm_debug from int to ulong, so BIT() is usable on it.

DRM's enum drm_debug_category values need to sync with the index of
their respective class-names here.  Then .class_id == category, and
dyndbg's class FOO mechanisms will enable drm_dbg(DRM_UT_KMS, ...).

Though DRM needs consistent categories across all modules, thats not
generally needed; modules X and Y could define FOO differently (ie a
different NAME => class_id mapping), changes are made according to
each module's private class-map.

No callsites are actually selected by this patch, since none are
class'd yet.

Signed-off-by: Jim Cromie 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 14 +
 drivers/gpu/drm/display/drm_dp_helper.c | 13 
 drivers/gpu/drm/drm_crtc_helper.c   | 13 
 drivers/gpu/drm/drm_print.c | 27 +++--
 drivers/gpu/drm/i915/i915_params.c  | 12 +++
 drivers/gpu/drm/nouveau/nouveau_drm.c   | 13 
 include/drm/drm_print.h |  3 ++-
 7 files changed, 92 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 429fcdf28836..5f091cb52de2 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -38,6 +38,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 
 #include "amdgpu.h"
 #include "amdgpu_irq.h"
@@ -185,6 +187,18 @@ int amdgpu_vcnfw_log;
 
 static void amdgpu_drv_delayed_reset_work_handler(struct work_struct *work);
 
+DECLARE_DYNDBG_CLASSMAP(drm_debug_classes, DD_CLASS_TYPE_DISJOINT_BITS, 0,
+   "DRM_UT_CORE",
+   "DRM_UT_DRIVER",
+   "DRM_UT_KMS",
+   "DRM_UT_PRIME",
+   "DRM_UT_ATOMIC",
+   "DRM_UT_VBL",
+   "DRM_UT_STATE",
+   "DRM_UT_LEASE",
+   "DRM_UT_DP",
+   "DRM_UT_DRMRES");
+
 struct amdgpu_mgpu_info mgpu_info = {
.mutex = __MUTEX_INITIALIZER(mgpu_info.mutex),
.delayed_reset_work = __DELAYED_WORK_INITIALIZER(
diff --git a/drivers/gpu/drm/display/drm_dp_helper.c 
b/drivers/gpu/drm/display/drm_dp_helper.c
index e5bab236b3ae..196dfb1e8d87 100644
--- a/drivers/gpu/drm/display/drm_dp_helper.c
+++ b/drivers/gpu/drm/display/drm_dp_helper.c
@@ -30,6 +30,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -40,6 +41,18 @@
 
 #include "drm_dp_helper_internal.h"
 
+DECLARE_DYNDBG_CLASSMAP(drm_debug_classes, DD_CLASS_TYPE_DISJOINT_BITS, 0,
+   "DRM_UT_CORE",
+   "DRM_UT_DRIVER",
+   "DRM_UT_KMS",
+   "DRM_UT_PRIME",
+   "DRM_UT_ATOMIC",
+   "DRM_UT_VBL",
+   "DRM_UT_STATE",
+   "DRM_UT_LEASE",
+   "DRM_UT_DP",
+   "DRM_UT_DRMRES");
+
 struct dp_aux_backlight {
struct backlight_device *base;
struct drm_dp_aux *aux;
diff --git a/drivers/gpu/drm/drm_crtc_helper.c 
b/drivers/gpu/drm/drm_crtc_helper.c
index 8a6d54515f92..a8cee6694cf6 100644
--- a/drivers/gpu/drm/drm_crtc_helper.c
+++ b/drivers/gpu/drm/drm_crtc_helper.c
@@ -32,6 +32,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -51,6 +52,18 @@
 
 #include "drm_crtc_helper_internal.h"
 
+DECLARE_DYNDBG_CLASSMAP(drm_debug_classes, DD_CLASS_TYPE_DISJOINT_BITS, 0,
+   "DRM_UT_CORE",
+   "DRM_UT_DRIVER",
+   "DRM_UT_KMS",
+   "DRM_UT_PRIME",
+   "DRM_UT_ATOMIC",
+   "DRM_UT_VBL",
+   "DRM_UT_STATE",
+   "DRM_UT_LEASE",
+   "DRM_UT_DP",
+   "DRM_UT_DRMRES");
+
 /**
  * DOC: overview
  *
diff --git a/drivers/gpu/drm/drm_print.c b/drivers/gpu/drm/drm_print.c
index f783d4963d4b..ec32df35a3e3 100644
--- a/drivers/gpu/drm/drm_print.c
+++ b/drivers/gpu/drm/drm_print.c
@@ -40,7 +40,7 @@
  * __drm_debug: Enable debug output.
  * Bitmask of DRM_UT_x. See include/drm/drm_print.h for details.
  */

[PATCH v7 0/9] dyndbg: drm.debug adaptation

2022-09-11 Thread Jim Cromie
hi Greg, Dan, Jason, DRM-folk,

heres follow-up to V6:
  rebased on driver-core/driver-core-next for -v6 applied bits (thanks)
  rework drm_debug_enabled{_raw,_instrumented,} per Dan.

It excludes:
  nouveau parts (immature)
  tracefs parts (I missed --to=Steve on v6)
  split _ddebug_site and de-duplicate experiment (way unready)

IOW, its the remaining commits of V6 on which Dan gave his Reviewed-by.

If these are good to apply, I'll rebase and repost the rest separately.

These are also available at:
https://github.com/jimc/linux/releases/tag/dyndbg%2Fdd-drm-class-911

RFC:

DECLARE_DYNDBG_CLASSMAP's interface can be improved: class-names are
currently strings, like "DRM_UT_CORE", which must have corresponding
ENUM symbols defined.  It would be better to just pass DRM_UT_CORE,
etc, and stringify the va-args inside the macro while initializing
classnames member.  But how ?


Jim Cromie (9):
  drm_print: condense enum drm_debug_category
  drm: POC drm on dyndbg - use in core, 2 helpers, 3 drivers.
  drm_print: interpose drm_*dbg with forwarding macros
  drm_print: wrap drm_*_dbg in dyndbg descriptor factory macro
  drm-print.h: include dyndbg header
  drm-print: add drm_dbg_driver to improve namespace symmetry
  drm_print: optimize drm_debug_enabled for jump-label
  drm_print: prefer bare printk KERN_DEBUG on generic fn
  drm_print: add _ddebug descriptor to drm_*dbg prototypes

 drivers/gpu/drm/Kconfig | 12 
 drivers/gpu/drm/Makefile|  2 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 14 +
 drivers/gpu/drm/display/drm_dp_helper.c | 13 +
 drivers/gpu/drm/drm_crtc_helper.c   | 13 +
 drivers/gpu/drm/drm_print.c | 48 +++
 drivers/gpu/drm/i915/i915_params.c  | 12 
 drivers/gpu/drm/nouveau/nouveau_drm.c   | 13 +
 include/drm/drm_print.h | 78 +++--
 9 files changed, 174 insertions(+), 31 deletions(-)

-- 
2.37.3



Re: [PATCH 1/7] drm/amdgpu: move nbio remap_hdp_registers() to gmc9 code

2022-09-11 Thread Lazar, Lijo




On 9/9/2022 11:05 PM, Alex Deucher wrote:

On Fri, Sep 9, 2022 at 1:17 PM Lazar, Lijo  wrote:




On 9/9/2022 10:17 PM, Alex Deucher wrote:

This is where it is used, so move it into gmc init so


It's only the *side effect* of GMC IP init process, but that doesn't
mean these IPs are interlinked. Any IP init process which requires HDP
flush also would need this. It is not a good idea to couple HDP remap
with GMC especially when there exists a HDP data path way without
setting up GMC (MM INDEX/DATA).


We have no need for HDP flush at all without vram, and we only have
access to vram once GMC is initialized so it is sort of a dependency
in that regard.  We also call a bunch of the HDP callbacks in the GMC
code and I think those are sort of the boat.  Also, the whole reason
we are in this situation is because we need to init GMC before all
other HW because all other hardware has a dependency on being able to
access GPU memory.



We do have early VRAM access usecases over HDP to fixed offsets for 
discovery region, 2-stage memory training etc. So far there is no 
requirement for flush, or flush might be happening indirectly because of 
a register access. That doesn't rule out any future requirements for 
explicit HDP flush. Prefer to keep HDP and GMC programming separate.


Thanks,
Lijo



  From a generic software perspective, I think programming pre-requisite
for HDP flush need to be standalone and the order needs to be guaranteed
before any client IPs that make use of it.


In that case patches 5, 6, 7 could be an alternative.

Alex



Thanks,
Lijo


that it will always be initialized in the right order.
We already do this for other nbio and hdp callbacks so
it's consistent with what we do on other IPs.

This fixes the Unsupported Request error reported through
AER during driver load. The error happens as a write happens
to the remap offset before real remapping is done.

Link: 
https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fbugzilla.kernel.org%2Fshow_bug.cgi%3Fid%3D216373data=05%7C01%7Clijo.lazar%40amd.com%7C984f5015c4104040ca1d08da9289c85d%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637983417715604666%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7Csdata=BJL7OWuAlaOzA%2F2G%2BYSzkdtaO3TmYwRK1gAsw26pW1U%3Dreserved=0

The error was unnoticed before and got visible because of the commit
referenced below. This doesn't fix anything in the commit below, rather
fixes the issue in amdgpu exposed by the commit. The reference is only
to associate this commit with below one so that both go together.

Fixes: 8795e182b02d ("PCI/portdrv: Don't disable AER reporting in 
get_port_device_capability()")

Signed-off-by: Alex Deucher 
---
   drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 7 +++
   drivers/gpu/drm/amd/amdgpu/soc15.c| 7 ---
   2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
index 4603653916f5..3a4b0a475672 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
@@ -1819,6 +1819,13 @@ static int gmc_v9_0_hw_init(void *handle)
   bool value;
   int i, r;

+ /* remap HDP registers to a hole in mmio space,
+  * for the purpose of expose those registers
+  * to process space
+  */
+ if (adev->nbio.funcs->remap_hdp_registers && !amdgpu_sriov_vf(adev))
+ adev->nbio.funcs->remap_hdp_registers(adev);
+
   /* The sequence of these two function calls matters.*/
   gmc_v9_0_init_golden_registers(adev);

diff --git a/drivers/gpu/drm/amd/amdgpu/soc15.c 
b/drivers/gpu/drm/amd/amdgpu/soc15.c
index 5188da87428d..39c3c6d65aef 100644
--- a/drivers/gpu/drm/amd/amdgpu/soc15.c
+++ b/drivers/gpu/drm/amd/amdgpu/soc15.c
@@ -1240,13 +1240,6 @@ static int soc15_common_hw_init(void *handle)
   soc15_program_aspm(adev);
   /* setup nbio registers */
   adev->nbio.funcs->init_registers(adev);
- /* remap HDP registers to a hole in mmio space,
-  * for the purpose of expose those registers
-  * to process space
-  */
- if (adev->nbio.funcs->remap_hdp_registers && !amdgpu_sriov_vf(adev))
- adev->nbio.funcs->remap_hdp_registers(adev);
-
   /* enable the doorbell aperture */
   soc15_enable_doorbell_aperture(adev, true);
   /* HW doorbell routing policy: doorbell writing not



Re: [PATCH 0/5] drm/amd/display: Reduce stack usage for clang

2022-09-11 Thread Maíra Canal
Hi Nathan,

I have built-tested the whole series with clang 14.0.5 (Fedora
14.0.5-1.fc36), using:

$ make -kj"$(nproc)" ARCH=x86_64 LLVM=1 mrproper allmodconfig
drivers/gpu/drm/amd/amdgpu/

Great to see this patchset coming for DML!

To the whole series:

Tested-by: Maíra Canal 

Best Regards,
- Maíra Canal

On 8/30/22 17:34, Nathan Chancellor wrote:
> Hi all,
> 
> This series aims to address the following warnings, which are visible
> when building x86_64 allmodconfig with clang after commit 3876a8b5e241
> ("drm/amd/display: Enable building new display engine with KCOV
> enabled").
> 
> 
> drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn30/display_mode_vba_30.c:3542:6:
>  error: stack frame size (2200) exceeds limit (2048) in 
> 'dml30_ModeSupportAndSystemConfigurationFull' [-Werror,-Wframe-larger-than]
> void dml30_ModeSupportAndSystemConfigurationFull(struct display_mode_lib 
> *mode_lib)
> ^
> 1 error generated.
> 
> 
> drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn31/display_mode_vba_31.c:3908:6:
>  error: stack frame size (2216) exceeds limit (2048) in 
> 'dml31_ModeSupportAndSystemConfigurationFull' [-Werror,-Wframe-larger-than]
> void dml31_ModeSupportAndSystemConfigurationFull(struct display_mode_lib 
> *mode_lib)
> ^
> 1 error generated.
> 
> 
> drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn32/display_mode_vba_32.c:1721:6:
>  error: stack frame size (2152) exceeds limit (2048) in 
> 'dml32_ModeSupportAndSystemConfigurationFull' [-Werror,-Wframe-larger-than]
> void dml32_ModeSupportAndSystemConfigurationFull(struct display_mode_lib 
> *mode_lib)
> ^
> 1 error generated.
> 
> This series is based on commit b3235e8635e1 ("drm/amd/display: clean up
> some inconsistent indentings"). These warnings are fatal for
> allmodconfig due to CONFIG_WERROR so ideally, I would like to see these
> patches cherry-picked to a branch targeting mainline to allow our builds
> to go back to green. However, since this series is not exactly trivial
> in size, I can understand not wanting to apply these to mainline during
> the -rc cycle. If they cannot be cherry-picked to mainline, I can add a
> patch raising the value of -Wframe-larger-than for these files that can
> be cherry-picked to 6.0/mainline then add a revert of that change as the
> last patch in the stack so everything goes back to normal for -next/6.1.
> I am open to other options though!
> 
> I have built this series against clang 16.0.0 (ToT) and GCC 12.2.0 for
> x86_64. It has seen no runtime testing, as my only test system with AMD
> graphics is a Renoir one, which as far as I understand it uses DCN 2.1.
> 
> Nathan Chancellor (5):
>   drm/amd/display: Reduce number of arguments of
> dml32_CalculateWatermarksMALLUseAndDRAMSpeedChangeSupport()
>   drm/amd/display: Reduce number of arguments of
> dml32_CalculatePrefetchSchedule()
>   drm/amd/display: Reduce number of arguments of dml31's
> CalculateWatermarksAndDRAMSpeedChangeSupport()
>   drm/amd/display: Reduce number of arguments of dml31's
> CalculateFlipSchedule()
>   drm/amd/display: Mark dml30's UseMinimumDCFCLK() as noinline for stack
> usage
> 
>  .../dc/dml/dcn30/display_mode_vba_30.c|   2 +-
>  .../dc/dml/dcn31/display_mode_vba_31.c| 420 +-
>  .../dc/dml/dcn32/display_mode_vba_32.c| 236 +++---
>  .../dc/dml/dcn32/display_mode_vba_util_32.c   | 323 ++
>  .../dc/dml/dcn32/display_mode_vba_util_32.h   |  51 +--
>  5 files changed, 318 insertions(+), 714 deletions(-)
> 
> 
> base-commit: b3235e8635e1dd7ac1a27a73330e9880dfe05154