Re: [PATCH v6 28/57] drm_print: refine drm_debug_enabled for jump-label

2022-09-09 Thread jim . cromie
On Wed, Sep 7, 2022 at 12:40 AM Daniel Vetter  wrote:
>
> On Sun, Sep 04, 2022 at 03:41:05PM -0600, Jim Cromie wrote:
> > In order to use dynamic-debug's jump-label optimization in drm-debug,
> > its clarifying to refine drm_debug_enabled into 3 uses:
> >
> > 1.   drm_debug_enabled - legacy, public
> > 2. __drm_debug_enabled - optimized for dyndbg jump-label enablement.
> > 3.  _drm_debug_enabled - pr_debug instrumented, observable
> >
> > 1. The legacy version always checks the bits.
> >
> > 2. is privileged, for use by __drm_dbg(), __drm_dev_dbg(), which do an
> > early return unless the category is enabled.  For dyndbg builds, debug
> > callsites are selectively "pre-enabled", so __drm_debug_enabled()
> > short-circuits to true there.  Remaining callers of 1 may be able to
> > use 2, case by case.
> >
> > 3. is 1st wrapped in a macro, with a pr_debug, which reports each
> > usage in /proc/dynamic_debug/control, making it observable in the
> > logs.  The macro lets the pr_debug see the real caller, not an inline
> > function.
> >
> > When plugged into 1, 3 identified ~10 remaining callers of the
> > function, leading to the follow-on cleanup patch, and would allow
> > activating the pr_debugs, estimating the callrate, and the potential
> > savings by using the wrapper macro.  It is unused ATM, but it fills
> > out the picture.
> >
> > Signed-off-by: Jim Cromie 
>
> So instead of having 3 here as a "you need to hack it in to see what
> should be converted" I have a bit a different idea: Could we make the
> public version also a dyndbg callsite (like the printing wrappers), but
> instead of a dynamic call we'd have a dynamically fixed value we get out?
> I think that would take care of everything you have here as an open.
>
> Otherwise I'd just drop 3 for the series we're going to merge.
> -Daniel
>

OK - So here it is in use again,  with  modules drm amdgpu i915 loaded + deps

:#> grep todo /proc/dynamic_debug/control
drivers/gpu/drm/drm_edid_load.c:178 [drm]edid_load =_ "todo: maybe
avoid via dyndbg\n"
drivers/gpu/drm/drm_vblank.c:410 [drm]drm_crtc_accurate_vblank_count
=_ "todo: maybe avoid via dyndbg\n"
drivers/gpu/drm/drm_vblank.c:787
[drm]drm_crtc_vblank_helper_get_vblank_timestamp_internal =_ "todo:
maybe avoid via dyndbg\n"
drivers/gpu/drm/drm_vblank.c:1491 [drm]drm_vblank_restore =_ "todo:
maybe avoid via dyndbg\n"
drivers/gpu/drm/drm_vblank.c:1433 [drm]drm_vblank_enable =_ "todo:
maybe avoid via dyndbg\n"
drivers/gpu/drm/drm_plane.c:2168 [drm]drm_mode_setplane =_ "todo:
maybe avoid via dyndbg\n"
drivers/gpu/drm/display/drm_dp_mst_topology.c:1359
[drm_display_helper]drm_dp_mst_wait_tx_reply =_ "todo: maybe avoid via
dyndbg\n"
drivers/gpu/drm/display/drm_dp_mst_topology.c:2864
[drm_display_helper]process_single_tx_qlock =_ "todo: maybe avoid via
dyndbg\n"
drivers/gpu/drm/display/drm_dp_mst_topology.c:2909
[drm_display_helper]drm_dp_queue_down_tx =_ "todo: maybe avoid via
dyndbg\n"
drivers/gpu/drm/display/drm_dp_mst_topology.c:1686
[drm_display_helper]drm_dp_mst_update_slots =_ "todo: maybe avoid via
dyndbg\n"
drivers/gpu/drm/i915/display/intel_dp.c:
[i915]intel_dp_print_rates =_ "todo: maybe avoid via dyndbg\n"
drivers/gpu/drm/i915/display/intel_backlight.c:5434
[i915]cnp_enable_backlight =_ "todo: maybe avoid via dyndbg\n"
drivers/gpu/drm/i915/display/intel_backlight.c:5459
[i915]intel_backlight_device_register =_ "todo: maybe avoid via
dyndbg\n"
drivers/gpu/drm/i915/display/intel_opregion.c:43
[i915]intel_opregion_notify_encoder =_ "todo: maybe avoid via
dyndbg\n"
drivers/gpu/drm/i915/display/intel_opregion.c:53
[i915]asle_set_backlight =_ "todo: maybe avoid via dyndbg\n"
drivers/gpu/drm/i915/display/intel_bios.c:1088
[i915]intel_bios_is_dsi_present =_ "todo: maybe avoid via dyndbg\n"
drivers/gpu/drm/i915/display/intel_display_debugfs.c:6153
[i915]i915_drrs_ctl_set =_ "todo: maybe avoid via dyndbg\n"
drivers/gpu/drm/i915/intel_pcode.c:26 [i915]snb_pcode_read =_ "todo:
maybe avoid via dyndbg\n"
drivers/gpu/drm/i915/i915_getparam.c:785 [i915]i915_getparam_ioctl =_
"todo: maybe avoid via dyndbg\n"
drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c:282
[amdgpu]vcn_v2_5_process_interrupt =_ "todo: maybe avoid via dyndbg\n"
drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c:433
[amdgpu]vcn_v2_0_process_interrupt =_ "todo: maybe avoid via dyndbg\n"

w/o actually looking, the vblank debug could be called frequently.
I'll build on my amdgpu box to run on real hardware.

And Im inclined to restore the instrumented version (with the "todo:")
care to suggest a better message than "maybe avoid" ?


Re: [PATCH v6 28/57] drm_print: refine drm_debug_enabled for jump-label

2022-09-07 Thread Daniel Vetter
On Sun, Sep 04, 2022 at 03:41:05PM -0600, Jim Cromie wrote:
> In order to use dynamic-debug's jump-label optimization in drm-debug,
> its clarifying to refine drm_debug_enabled into 3 uses:
> 
> 1.   drm_debug_enabled - legacy, public
> 2. __drm_debug_enabled - optimized for dyndbg jump-label enablement.
> 3.  _drm_debug_enabled - pr_debug instrumented, observable
> 
> 1. The legacy version always checks the bits.
> 
> 2. is privileged, for use by __drm_dbg(), __drm_dev_dbg(), which do an
> early return unless the category is enabled.  For dyndbg builds, debug
> callsites are selectively "pre-enabled", so __drm_debug_enabled()
> short-circuits to true there.  Remaining callers of 1 may be able to
> use 2, case by case.
> 
> 3. is 1st wrapped in a macro, with a pr_debug, which reports each
> usage in /proc/dynamic_debug/control, making it observable in the
> logs.  The macro lets the pr_debug see the real caller, not an inline
> function.
> 
> When plugged into 1, 3 identified ~10 remaining callers of the
> function, leading to the follow-on cleanup patch, and would allow
> activating the pr_debugs, estimating the callrate, and the potential
> savings by using the wrapper macro.  It is unused ATM, but it fills
> out the picture.
> 
> Signed-off-by: Jim Cromie 

So instead of having 3 here as a "you need to hack it in to see what
should be converted" I have a bit a different idea: Could we make the
public version also a dyndbg callsite (like the printing wrappers), but
instead of a dynamic call we'd have a dynamically fixed value we get out?
I think that would take care of everything you have here as an open.

Otherwise I'd just drop 3 for the series we're going to merge.
-Daniel

> ---
>  drivers/gpu/drm/drm_print.c |  4 ++--
>  include/drm/drm_print.h | 28 
>  2 files changed, 30 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_print.c b/drivers/gpu/drm/drm_print.c
> index 29a29949ad0b..cb203d63b286 100644
> --- a/drivers/gpu/drm/drm_print.c
> +++ b/drivers/gpu/drm/drm_print.c
> @@ -285,7 +285,7 @@ void __drm_dev_dbg(const struct device *dev, enum 
> drm_debug_category category,
>   struct va_format vaf;
>   va_list args;
>  
> - if (!drm_debug_enabled(category))
> + if (!__drm_debug_enabled(category))
>   return;
>  
>   va_start(args, format);
> @@ -308,7 +308,7 @@ void ___drm_dbg(enum drm_debug_category category, const 
> char *format, ...)
>   struct va_format vaf;
>   va_list args;
>  
> - if (!drm_debug_enabled(category))
> + if (!__drm_debug_enabled(category))
>   return;
>  
>   va_start(args, format);
> diff --git a/include/drm/drm_print.h b/include/drm/drm_print.h
> index dfdd81c3287c..7631b5fb669e 100644
> --- a/include/drm/drm_print.h
> +++ b/include/drm/drm_print.h
> @@ -321,11 +321,39 @@ enum drm_debug_category {
>   DRM_UT_DRMRES
>  };
>  
> +/*
> + * 3 name flavors of drm_debug_enabled:
> + *   drm_debug_enabled - public/legacy, always checks bits
> + *  _drm_debug_enabled - instrumented to observe call-rates, est overheads.
> + * __drm_debug_enabled - privileged - knows jump-label state, can 
> short-circuit
> + */
>  static inline bool drm_debug_enabled(enum drm_debug_category category)
>  {
>   return unlikely(__drm_debug & BIT(category));
>  }
>  
> +/*
> + * Wrap fn in macro, so that the pr_debug sees the actual caller, not
> + * the inline fn.  Using this name creates a callsite entry / control
> + * point in /proc/dynamic_debug/control.
> + */
> +#define _drm_debug_enabled(category) \
> + ({  \
> + pr_debug("todo: maybe avoid via dyndbg\n"); \
> + drm_debug_enabled(category);\
> + })
> +
> +#if defined(CONFIG_DRM_USE_DYNAMIC_DEBUG)
> +/*
> + * dyndbg is wrapping the drm.debug API, so as to avoid the runtime
> + * bit-test overheads of drm_debug_enabled() in those api calls.
> + * In this case, executed callsites are known enabled, so true.
> + */
> +#define __drm_debug_enabled(category)true
> +#else
> +#define __drm_debug_enabled(category)drm_debug_enabled(category)
> +#endif
> +
>  /*
>   * struct device based logging
>   *
> -- 
> 2.37.2
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


[PATCH v6 28/57] drm_print: refine drm_debug_enabled for jump-label

2022-09-04 Thread Jim Cromie
In order to use dynamic-debug's jump-label optimization in drm-debug,
its clarifying to refine drm_debug_enabled into 3 uses:

1.   drm_debug_enabled - legacy, public
2. __drm_debug_enabled - optimized for dyndbg jump-label enablement.
3.  _drm_debug_enabled - pr_debug instrumented, observable

1. The legacy version always checks the bits.

2. is privileged, for use by __drm_dbg(), __drm_dev_dbg(), which do an
early return unless the category is enabled.  For dyndbg builds, debug
callsites are selectively "pre-enabled", so __drm_debug_enabled()
short-circuits to true there.  Remaining callers of 1 may be able to
use 2, case by case.

3. is 1st wrapped in a macro, with a pr_debug, which reports each
usage in /proc/dynamic_debug/control, making it observable in the
logs.  The macro lets the pr_debug see the real caller, not an inline
function.

When plugged into 1, 3 identified ~10 remaining callers of the
function, leading to the follow-on cleanup patch, and would allow
activating the pr_debugs, estimating the callrate, and the potential
savings by using the wrapper macro.  It is unused ATM, but it fills
out the picture.

Signed-off-by: Jim Cromie 
---
 drivers/gpu/drm/drm_print.c |  4 ++--
 include/drm/drm_print.h | 28 
 2 files changed, 30 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/drm_print.c b/drivers/gpu/drm/drm_print.c
index 29a29949ad0b..cb203d63b286 100644
--- a/drivers/gpu/drm/drm_print.c
+++ b/drivers/gpu/drm/drm_print.c
@@ -285,7 +285,7 @@ void __drm_dev_dbg(const struct device *dev, enum 
drm_debug_category category,
struct va_format vaf;
va_list args;
 
-   if (!drm_debug_enabled(category))
+   if (!__drm_debug_enabled(category))
return;
 
va_start(args, format);
@@ -308,7 +308,7 @@ void ___drm_dbg(enum drm_debug_category category, const 
char *format, ...)
struct va_format vaf;
va_list args;
 
-   if (!drm_debug_enabled(category))
+   if (!__drm_debug_enabled(category))
return;
 
va_start(args, format);
diff --git a/include/drm/drm_print.h b/include/drm/drm_print.h
index dfdd81c3287c..7631b5fb669e 100644
--- a/include/drm/drm_print.h
+++ b/include/drm/drm_print.h
@@ -321,11 +321,39 @@ enum drm_debug_category {
DRM_UT_DRMRES
 };
 
+/*
+ * 3 name flavors of drm_debug_enabled:
+ *   drm_debug_enabled - public/legacy, always checks bits
+ *  _drm_debug_enabled - instrumented to observe call-rates, est overheads.
+ * __drm_debug_enabled - privileged - knows jump-label state, can short-circuit
+ */
 static inline bool drm_debug_enabled(enum drm_debug_category category)
 {
return unlikely(__drm_debug & BIT(category));
 }
 
+/*
+ * Wrap fn in macro, so that the pr_debug sees the actual caller, not
+ * the inline fn.  Using this name creates a callsite entry / control
+ * point in /proc/dynamic_debug/control.
+ */
+#define _drm_debug_enabled(category)   \
+   ({  \
+   pr_debug("todo: maybe avoid via dyndbg\n"); \
+   drm_debug_enabled(category);\
+   })
+
+#if defined(CONFIG_DRM_USE_DYNAMIC_DEBUG)
+/*
+ * dyndbg is wrapping the drm.debug API, so as to avoid the runtime
+ * bit-test overheads of drm_debug_enabled() in those api calls.
+ * In this case, executed callsites are known enabled, so true.
+ */
+#define __drm_debug_enabled(category)  true
+#else
+#define __drm_debug_enabled(category)  drm_debug_enabled(category)
+#endif
+
 /*
  * struct device based logging
  *
-- 
2.37.2