On Mon, Nov 03, 2025 at 02:18:12PM -0300, Gustavo Sousa wrote:
> From: Vinod Govindapillai <[email protected]>
> 
> Configure one of the FBC instances to use system caching. FBC
> read/write requests are tagged as cacheable till a programmed
> limit is reached by the hw.

What exactly is "system caching?"  We have lots of different caches in
current platforms, and it's not really obvious to me from the
description here (or the bspec page) exactly which cache(s) are involved
here.

Is turning this on always a win or is it situational?  I.e., is there
any potential for display memory traffic to fill a cache with FBC data
by evicting data that was part of the CPU or GT's working set?  If so,
that seems like it could potentially harm the performance of other
workloads running on the platform.

Or is this whole thing about a completely new cache (unrelated to
and unusable by anything else) which is devoted solely to FBC?

> 
> Bspec: 74722

You might want to add 68881 here since it has a bit more information
about how we're actually supposed to set the fields documented on 74722.

> Signed-off-by: Vinod Govindapillai <[email protected]>
> Signed-off-by: Gustavo Sousa <[email protected]>
> ---
>  drivers/gpu/drm/i915/display/intel_fbc.c      | 47 
> +++++++++++++++++++++++++++
>  drivers/gpu/drm/i915/display/intel_fbc_regs.h |  9 +++++
>  2 files changed, 56 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/display/intel_fbc.c 
> b/drivers/gpu/drm/i915/display/intel_fbc.c
> index 24b72951ea3c..e2e55c58ddbc 100644
> --- a/drivers/gpu/drm/i915/display/intel_fbc.c
> +++ b/drivers/gpu/drm/i915/display/intel_fbc.c
> @@ -127,6 +127,9 @@ struct intel_fbc {
>        */
>       struct intel_fbc_state state;
>       const char *no_fbc_reason;
> +
> +     /* Only one of FBC instances can use the system cache */
> +     bool own_sys_cache;
>  };
>  
>  static char fbc_name(enum intel_fbc_id fbc_id)
> @@ -569,12 +572,51 @@ static bool ilk_fbc_is_compressing(struct intel_fbc 
> *fbc)
>       return intel_de_read(fbc->display, ILK_DPFC_STATUS(fbc->id)) & 
> DPFC_COMP_SEG_MASK;
>  }
>  
> +static void nvl_fbc_program_system_cache(struct intel_fbc *fbc, bool enable)
> +{
> +     struct intel_display *display = fbc->display;
> +     u32 cfb_offset, usage;
> +
> +     lockdep_assert_held(&fbc->lock);
> +
> +     usage = intel_de_read(display, NVL_FBC_SYS_CACHE_USAGE_CFG);
> +
> +     /* System cache already being used by another pipe */
> +     if (enable && (usage & FBC_SYS_CACHE_TAG_USE_RES_SPACE))
> +             return;

Rather than relying on the current register contents, should we be
sanitizing this on driver probe (in case the pre-OS firmware already set
this up) and then making our own decisions (as part of an atomic
transaction) about which pipe to prioritize after that?

> +
> +     /* Only the fbc instance which owns system cache can disable it */
> +     if (!enable && !fbc->own_sys_cache)
> +             return;
> +
> +     /*
> +      * Not programming the cache limit and cache reading enable bits 
> explicitly
> +      * here. The default values should take care of those and that could 
> leave
> +      * adjustments of those bits to the system hw policy
> +      *
> +      * TODO: check if we need to explicitly program these?

There's no hardware default documented for the range field, so unless
the pre-OS firmware sets it up (which we probably shouldn't rely on),
I'd expect this to be 0; I don't think that's what we want.

> +      */
> +     cfb_offset = enable ? i915_gem_stolen_node_offset(fbc->compressed_fb) : 
> 0;
> +     usage |= FBC_SYS_CACHE_START_BASE(cfb_offset);

And if something *did* set this up already, then OR'ing a new address
over the old one isn't going to work.  We'd need "(old & ~mask) | new"
to ensure we don't have leftover bits still set by accident.  But it
would probably be better to just avoid RMW-style handling in general and
build a complete register value with exactly what we want rather than
trying to modify the pre-existing value.

> +     usage |= enable ? FBC_SYS_CACHE_TAG_USE_RES_SPACE : 
> FBC_SYS_CACHE_TAG_DONT_CACHE;
> +
> +     intel_de_write(display, NVL_FBC_SYS_CACHE_USAGE_CFG, usage);
> +
> +     fbc->own_sys_cache = enable;

It feels like instead of having this as a boolean flag in fbc, this
should be a pointer/ID tracked at the intel_display level.  E.g.,

        display->sys_cache_fbc = fbc;

or possibly converted over to something tracked with atomic state so
that we can make better high-level decisions about which FBC we want to
enable this on as various displays get enabled/disabled.


Matt

> +
> +     drm_dbg_kms(display->drm, "System caching for FBC[%d] %s\n",
> +                 fbc->id, enable ? "configured" : "cleared");
> +}
> +
>  static void ilk_fbc_program_cfb(struct intel_fbc *fbc)
>  {
>       struct intel_display *display = fbc->display;
>  
>       intel_de_write(display, ILK_DPFC_CB_BASE(fbc->id),
>                      i915_gem_stolen_node_offset(fbc->compressed_fb));
> +
> +     if (DISPLAY_VER(display) >= 35)
> +             nvl_fbc_program_system_cache(fbc, true);
>  }
>  
>  static const struct intel_fbc_funcs ilk_fbc_funcs = {
> @@ -952,6 +994,8 @@ static void intel_fbc_program_workarounds(struct 
> intel_fbc *fbc)
>  
>  static void __intel_fbc_cleanup_cfb(struct intel_fbc *fbc)
>  {
> +     struct intel_display *display = fbc->display;
> +
>       if (WARN_ON(intel_fbc_hw_is_active(fbc)))
>               return;
>  
> @@ -959,6 +1003,9 @@ static void __intel_fbc_cleanup_cfb(struct intel_fbc 
> *fbc)
>               i915_gem_stolen_remove_node(fbc->compressed_llb);
>       if (i915_gem_stolen_node_allocated(fbc->compressed_fb))
>               i915_gem_stolen_remove_node(fbc->compressed_fb);
> +
> +     if (DISPLAY_VER(display) >= 35)
> +             nvl_fbc_program_system_cache(fbc, false);
>  }
>  
>  void intel_fbc_cleanup(struct intel_display *display)
> diff --git a/drivers/gpu/drm/i915/display/intel_fbc_regs.h 
> b/drivers/gpu/drm/i915/display/intel_fbc_regs.h
> index 77d8321c4fb3..592cd2384255 100644
> --- a/drivers/gpu/drm/i915/display/intel_fbc_regs.h
> +++ b/drivers/gpu/drm/i915/display/intel_fbc_regs.h
> @@ -128,4 +128,13 @@
>  #define   FBC_REND_NUKE                      REG_BIT(2)
>  #define   FBC_REND_CACHE_CLEAN               REG_BIT(1)
>  
> +#define NVL_FBC_SYS_CACHE_USAGE_CFG             _MMIO(0x1344E0)
> +#define   FBC_SYS_CACHE_START_BASE_MASK         REG_GENMASK(31, 16)
> +#define   FBC_SYS_CACHE_START_BASE(base)        
> REG_FIELD_PREP(FBC_SYS_CACHE_START_BASE_MASK, (base))
> +#define   FBC_SYS_CACHEABLE_RANGE_MASK          REG_GENMASK(15, 4)
> +#define   FBC_SYS_CACHEABLE_RANGE(range)        
> REG_FIELD_PREP(FBC_SYS_CACHEABLE_RANGE_MASK, (range))
> +#define   FBC_SYS_CACHE_TAG_MASK                REG_GENMASK(3, 2)
> +#define   FBC_SYS_CACHE_TAG_DONT_CACHE          
> REG_FIELD_PREP(FBC_SYS_CACHE_TAG_MASK, 0)
> +#define   FBC_SYS_CACHE_TAG_USE_RES_SPACE       
> REG_FIELD_PREP(FBC_SYS_CACHE_TAG_MASK, 3)
> +
>  #endif /* __INTEL_FBC_REGS__ */
> 
> -- 
> 2.51.0
> 

-- 
Matt Roper
Graphics Software Engineer
Linux GPU Platform Enablement
Intel Corporation

Reply via email to