On Mon, Sep 22, 2025 at 03:38:46PM +0300, Luca Coelho wrote:
> On Fri, 2025-09-19 at 11:36 +0300, Ville Syrjälä wrote:
> > On Mon, Sep 08, 2025 at 10:35:33AM +0300, Luca Coelho wrote:
> > > There are currently two booleans to define three tiling modes, which
> > > is bad practice because it allows representing an invalid mode.  In
> > > order to simplify this, convert these two booleans into one
> > > enumeration with three possible tiling modes.
> > > 
> > > Additionally, introduce the concept of Y "family" of tiling, which
> > > groups Y, Yf and 4 tiling, since they're effectively treated in the
> > > same way in the watermark calculations.  Describe the grouping in the
> > > enumeration definition.
> > > 
> > > Signed-off-by: Luca Coelho <[email protected]>
> > > ---
> > >  drivers/gpu/drm/i915/display/skl_watermark.c | 35 ++++++++++++++------
> > >  1 file changed, 24 insertions(+), 11 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/display/skl_watermark.c 
> > > b/drivers/gpu/drm/i915/display/skl_watermark.c
> > > index 0ce3420a919e..dd4bed02c3c0 100644
> > > --- a/drivers/gpu/drm/i915/display/skl_watermark.c
> > > +++ b/drivers/gpu/drm/i915/display/skl_watermark.c
> > > @@ -53,9 +53,16 @@ struct intel_dbuf_state {
> > >  #define intel_atomic_get_new_dbuf_state(state) \
> > >   to_intel_dbuf_state(intel_atomic_get_new_global_obj_state(state, 
> > > &to_intel_display(state)->dbuf.obj))
> > >  
> > > +/* Tiling mode groups relevant to WM calculations */
> > > +enum wm_tiling_mode {
> > > + WM_TILING_LINEAR,
> > > + WM_TILING_X_TILED,      /* mostly like linear */
> > 
> > The _TILED suffix seems redundant here.
> 
> Indeed.  I'll remove.  I just wanted to differentiate the actually
> tiled ones from linear, but that's moot.
> 
> 
> > > + WM_TILING_Y_FAMILY,     /* includes Y, Yf and 4 tiling */
> > 
> > I don't really like the "y family" invention. Doesn't really
> > unconfuse anything for the reader without going back to have
> > a look at the comment.
> > 
> > I think it would be better to just spell out each tilimg mode.
> > So I guess something like "WM_TILING_Y_Yf_4"
> 
> Yeah, I wasn't entirely happy with "family", but I really couldn't find
> any better term.  My idea was to make it generic enough so we wouldn't
> have to add a new tiling to the symbol every time we add something new.
> Which is what happened with the "Y_TILED" before, and included also Yf
> and 4 without any reference to this in the code.  It confused the crap
> out of me.
> 
> Anyway, your idea is definitely clearer, so I'll change this.
> 
> 
> > > +};
> > > +
> > >  /* Stores plane specific WM parameters */
> > >  struct skl_wm_params {
> > > - bool x_tiled, y_tiled;
> > > + enum wm_tiling_mode tiling;
> > 
> > That'll now be 4 bytes.
> > 
> > >   bool rc_surface;
> > >   bool is_planar;
> > 
> > and we'll have a two byte hole here.
> > 
> > >   u32 width;
> >         u8 cpp;
> > 
> > And there's a 3 byte hole already here after the cpp.
> > Should group the u8 with the bools to avoid so many holes.
> > 
> > We could also shrink y_min_scanlines to a u8 and
> > stick it into the last 1 byte hole. That'd shrink the whole
> > struct by 4 bytes.
> > 
> > dbuf_block_size would also fit in a u16, but doesn't look
> > like we have any other holes where we could stick it. Hmm,
> > actually 'width' could probably also be shrunk to be a u16.
> > So could get rid of another 4 bytes here if we really
> > wanted to.
> > 
> > But I suppose all that repacking should be a separate patch...
> 
> Okay, I'll move the two remaining bools below the u8, so we fill part
> of that existing space.  And I'll add a separate patch to reduce the
> y_min_scanlines to u8 and fill the existing hole.
> 
> I'll leave the dbuf_block_size and width change out for now.  I think 4
> bytes extra saving will not be worth the trouble, but I'll keep this in
> mind if I eventually encounter other changes to be made with these
> elements.
> 
> 
> > > @@ -618,7 +625,8 @@ static unsigned int skl_wm_latency(struct 
> > > intel_display *display, int level,
> > >        display->platform.cometlake) && skl_watermark_ipc_enabled(display))
> > >           latency += 4;
> > >  
> > > - if (skl_needs_memory_bw_wa(display) && wp && wp->x_tiled)
> > > + if (skl_needs_memory_bw_wa(display) &&
> > > +     wp && wp->tiling == WM_TILING_X_TILED)
> > >           latency += 15;
> > >  
> > >   return latency;
> > > @@ -1674,9 +1682,14 @@ skl_compute_wm_params(const struct 
> > > intel_crtc_state *crtc_state,
> > >           return -EINVAL;
> > >   }
> > >  
> > > - wp->x_tiled = modifier == I915_FORMAT_MOD_X_TILED;
> > > - wp->y_tiled = modifier != I915_FORMAT_MOD_X_TILED &&
> > > -         intel_fb_is_tiled_modifier(modifier);
> > > + if (modifier == I915_FORMAT_MOD_X_TILED)
> > > +         wp->tiling = WM_TILING_X_TILED;
> > > + else if (modifier != I915_FORMAT_MOD_X_TILED &&
> > 
> > The modifier check here is redundant with the if-else construct.
> 
> Indeed.
> 
> 
> > > +          intel_fb_is_tiled_modifier(modifier))
> > > +         wp->tiling = WM_TILING_Y_FAMILY;
> > > + else
> > > +         wp->tiling = WM_TILING_LINEAR;
> > 
> > In fact we can avoid the entire intel_fb_is_tiled_modifier()
> > call with something like:
> > 
> > if (mod == LINEAR)
> >     tiling = LINEAR;
> > else if (mod == X)
> >     tiling = X;
> > else
> >     tiling = Y_Yf_4;
> 
> Nice, I'll change it.
> 
> 
> > The wm code always pops up fairly high in cpu profiles, so
> > anything that makes it lighter is worth considering.
> 
> Oh, that's good to know.  But this is not happening in the "datapath",
> but during modesets, right? Do the modesets really take so long and
> happen so often so that avoiding a CPU cycles makes much difference?

This happens on every flip. While not a big deal perhaps for sync
flips @60Hz, but with async flips and/or high refresh rates it
starts to matter more.

-- 
Ville Syrjälä
Intel

Reply via email to