Hi Matt,
On Mon, 2025-11-03 at 16:15 -0800, Matt Roper wrote: > On Mon, Nov 03, 2025 at 02:18:12PM -0300, Gustavo Sousa wrote: > > From: Vinod Govindapillai <[email protected]> > > > > Configure one of the FBC instances to use system caching. FBC > > read/write requests are tagged as cacheable till a programmed > > limit is reached by the hw. > > What exactly is "system caching?" We have lots of different caches > in > current platforms, and it's not really obvious to me from the > description here (or the bspec page) exactly which cache(s) are > involved > here. > > Is turning this on always a win or is it situational? I.e., is there > any potential for display memory traffic to fill a cache with FBC > data > by evicting data that was part of the CPU or GT's working set? If > so, > that seems like it could potentially harm the performance of other > workloads running on the platform. > > Or is this whole thing about a completely new cache (unrelated to > and unusable by anything else) which is devoted solely to FBC? > > > > > Bspec: 74722 > > You might want to add 68881 here since it has a bit more information > about how we're actually supposed to set the fields documented on > 74722. Okay I will include that. I guess, the HAS also talks about "system cache" - no further explanation. But only a fixed portion is allocated specifically for the display use and is "configurable". Motivation is to reduce to memory subsystem power especially in idle scenarios. > > > Signed-off-by: Vinod Govindapillai <[email protected]> > > Signed-off-by: Gustavo Sousa <[email protected]> > > --- > > drivers/gpu/drm/i915/display/intel_fbc.c | 47 > > +++++++++++++++++++++++++++ > > drivers/gpu/drm/i915/display/intel_fbc_regs.h | 9 +++++ > > 2 files changed, 56 insertions(+) > > > > diff --git a/drivers/gpu/drm/i915/display/intel_fbc.c > > b/drivers/gpu/drm/i915/display/intel_fbc.c > > index 24b72951ea3c..e2e55c58ddbc 100644 > > --- a/drivers/gpu/drm/i915/display/intel_fbc.c > > +++ b/drivers/gpu/drm/i915/display/intel_fbc.c > > @@ -127,6 +127,9 @@ struct intel_fbc { > > */ > > struct intel_fbc_state state; > > const char *no_fbc_reason; > > + > > + /* Only one of FBC instances can use the system cache */ > > + bool own_sys_cache; > > }; > > > > static char fbc_name(enum intel_fbc_id fbc_id) > > @@ -569,12 +572,51 @@ static bool ilk_fbc_is_compressing(struct > > intel_fbc *fbc) > > return intel_de_read(fbc->display, ILK_DPFC_STATUS(fbc- > > >id)) & DPFC_COMP_SEG_MASK; > > } > > > > +static void nvl_fbc_program_system_cache(struct intel_fbc *fbc, > > bool enable) > > +{ > > + struct intel_display *display = fbc->display; > > + u32 cfb_offset, usage; > > + > > + lockdep_assert_held(&fbc->lock); > > + > > + usage = intel_de_read(display, > > NVL_FBC_SYS_CACHE_USAGE_CFG); > > + > > + /* System cache already being used by another pipe */ > > + if (enable && (usage & FBC_SYS_CACHE_TAG_USE_RES_SPACE)) > > + return; > > Rather than relying on the current register contents, should we be > sanitizing this on driver probe (in case the pre-OS firmware already > set > this up) and then making our own decisions (as part of an atomic > transaction) about which pipe to prioritize after that? > > > + > > + /* Only the fbc instance which owns system cache can > > disable it */ > > + if (!enable && !fbc->own_sys_cache) > > + return; > > + > > + /* > > + * Not programming the cache limit and cache reading > > enable bits explicitly > > + * here. The default values should take care of those and > > that could leave > > + * adjustments of those bits to the system hw policy > > + * > > + * TODO: check if we need to explicitly program these? > > There's no hardware default documented for the range field, so unless > the pre-OS firmware sets it up (which we probably shouldn't rely on), > I'd expect this to be 0; I don't think that's what we want. The Bspec says it is 2MB. But according to the HAS it is "configurable" and I clarified that this is at the moment 2MB but can change. So I read it as something already configured and set as the default value to the register and it could be changed by the soc policy. Thats the reason I thought it be kept untouched. Will forward on email conversation I had. > > > + */ > > + cfb_offset = enable ? i915_gem_stolen_node_offset(fbc- > > >compressed_fb) : 0; > > + usage |= FBC_SYS_CACHE_START_BASE(cfb_offset); > > And if something *did* set this up already, then OR'ing a new address > over the old one isn't going to work. We'd need "(old & ~mask) | > new" > to ensure we don't have leftover bits still set by accident. But it > would probably be better to just avoid RMW-style handling in general > and > build a complete register value with exactly what we want rather than > trying to modify the pre-existing value. > > > + usage |= enable ? FBC_SYS_CACHE_TAG_USE_RES_SPACE : > > FBC_SYS_CACHE_TAG_DONT_CACHE; > > + > > + intel_de_write(display, NVL_FBC_SYS_CACHE_USAGE_CFG, > > usage); > > + > > + fbc->own_sys_cache = enable; Okay. Thanks. Will fix that! > > It feels like instead of having this as a boolean flag in fbc, this > should be a pointer/ID tracked at the intel_display level. E.g., > > display->sys_cache_fbc = fbc; > > or possibly converted over to something tracked with atomic state so > that we can make better high-level decisions about which FBC we want > to > enable this on as various displays get enabled/disabled. Okay. Will check this and get rid of the bool from the intel_fbc structure! At the moment we can allocate only based on the firt pipe enabling the fbc. But may be in future we could have some logic for this I guess. > Matt > > > + > > + drm_dbg_kms(display->drm, "System caching for FBC[%d] > > %s\n", > > + fbc->id, enable ? "configured" : "cleared"); > > +} > > + > > static void ilk_fbc_program_cfb(struct intel_fbc *fbc) > > { > > struct intel_display *display = fbc->display; > > > > intel_de_write(display, ILK_DPFC_CB_BASE(fbc->id), > > i915_gem_stolen_node_offset(fbc- > > >compressed_fb)); > > + > > + if (DISPLAY_VER(display) >= 35) > > + nvl_fbc_program_system_cache(fbc, true); > > } > > > > static const struct intel_fbc_funcs ilk_fbc_funcs = { > > @@ -952,6 +994,8 @@ static void > > intel_fbc_program_workarounds(struct intel_fbc *fbc) > > > > static void __intel_fbc_cleanup_cfb(struct intel_fbc *fbc) > > { > > + struct intel_display *display = fbc->display; > > + > > if (WARN_ON(intel_fbc_hw_is_active(fbc))) > > return; > > > > @@ -959,6 +1003,9 @@ static void __intel_fbc_cleanup_cfb(struct > > intel_fbc *fbc) > > i915_gem_stolen_remove_node(fbc->compressed_llb); > > if (i915_gem_stolen_node_allocated(fbc->compressed_fb)) > > i915_gem_stolen_remove_node(fbc->compressed_fb); > > + > > + if (DISPLAY_VER(display) >= 35) > > + nvl_fbc_program_system_cache(fbc, false); > > } > > > > void intel_fbc_cleanup(struct intel_display *display) > > diff --git a/drivers/gpu/drm/i915/display/intel_fbc_regs.h > > b/drivers/gpu/drm/i915/display/intel_fbc_regs.h > > index 77d8321c4fb3..592cd2384255 100644 > > --- a/drivers/gpu/drm/i915/display/intel_fbc_regs.h > > +++ b/drivers/gpu/drm/i915/display/intel_fbc_regs.h > > @@ -128,4 +128,13 @@ > > #define FBC_REND_NUKE REG_BIT(2) > > #define FBC_REND_CACHE_CLEAN REG_BIT(1) > > > > +#define NVL_FBC_SYS_CACHE_USAGE_CFG _MMIO(0x1344E0) > > +#define FBC_SYS_CACHE_START_BASE_MASK REG_GENMASK(31, > > 16) > > +#define FBC_SYS_CACHE_START_BASE(base) > > REG_FIELD_PREP(FBC_SYS_CACHE_START_BASE_MASK, (base)) > > +#define FBC_SYS_CACHEABLE_RANGE_MASK REG_GENMASK(15, 4) > > +#define FBC_SYS_CACHEABLE_RANGE(range) > > REG_FIELD_PREP(FBC_SYS_CACHEABLE_RANGE_MASK, (range)) > > +#define FBC_SYS_CACHE_TAG_MASK REG_GENMASK(3, 2) > > +#define FBC_SYS_CACHE_TAG_DONT_CACHE > > REG_FIELD_PREP(FBC_SYS_CACHE_TAG_MASK, 0) > > +#define FBC_SYS_CACHE_TAG_USE_RES_SPACE > > REG_FIELD_PREP(FBC_SYS_CACHE_TAG_MASK, 3) > > + > > #endif /* __INTEL_FBC_REGS__ */ > > > > -- > > 2.51.0 > > >
