On Mon, Sep 22, 2025 at 03:38:46PM +0300, Luca Coelho wrote: > On Fri, 2025-09-19 at 11:36 +0300, Ville Syrjälä wrote: > > On Mon, Sep 08, 2025 at 10:35:33AM +0300, Luca Coelho wrote: > > > There are currently two booleans to define three tiling modes, which > > > is bad practice because it allows representing an invalid mode. In > > > order to simplify this, convert these two booleans into one > > > enumeration with three possible tiling modes. > > > > > > Additionally, introduce the concept of Y "family" of tiling, which > > > groups Y, Yf and 4 tiling, since they're effectively treated in the > > > same way in the watermark calculations. Describe the grouping in the > > > enumeration definition. > > > > > > Signed-off-by: Luca Coelho <[email protected]> > > > --- > > > drivers/gpu/drm/i915/display/skl_watermark.c | 35 ++++++++++++++------ > > > 1 file changed, 24 insertions(+), 11 deletions(-) > > > > > > diff --git a/drivers/gpu/drm/i915/display/skl_watermark.c > > > b/drivers/gpu/drm/i915/display/skl_watermark.c > > > index 0ce3420a919e..dd4bed02c3c0 100644 > > > --- a/drivers/gpu/drm/i915/display/skl_watermark.c > > > +++ b/drivers/gpu/drm/i915/display/skl_watermark.c > > > @@ -53,9 +53,16 @@ struct intel_dbuf_state { > > > #define intel_atomic_get_new_dbuf_state(state) \ > > > to_intel_dbuf_state(intel_atomic_get_new_global_obj_state(state, > > > &to_intel_display(state)->dbuf.obj)) > > > > > > +/* Tiling mode groups relevant to WM calculations */ > > > +enum wm_tiling_mode { > > > + WM_TILING_LINEAR, > > > + WM_TILING_X_TILED, /* mostly like linear */ > > > > The _TILED suffix seems redundant here. > > Indeed. I'll remove. I just wanted to differentiate the actually > tiled ones from linear, but that's moot. > > > > > + WM_TILING_Y_FAMILY, /* includes Y, Yf and 4 tiling */ > > > > I don't really like the "y family" invention. Doesn't really > > unconfuse anything for the reader without going back to have > > a look at the comment. > > > > I think it would be better to just spell out each tilimg mode. > > So I guess something like "WM_TILING_Y_Yf_4" > > Yeah, I wasn't entirely happy with "family", but I really couldn't find > any better term. My idea was to make it generic enough so we wouldn't > have to add a new tiling to the symbol every time we add something new. > Which is what happened with the "Y_TILED" before, and included also Yf > and 4 without any reference to this in the code. It confused the crap > out of me. > > Anyway, your idea is definitely clearer, so I'll change this. > > > > > +}; > > > + > > > /* Stores plane specific WM parameters */ > > > struct skl_wm_params { > > > - bool x_tiled, y_tiled; > > > + enum wm_tiling_mode tiling; > > > > That'll now be 4 bytes. > > > > > bool rc_surface; > > > bool is_planar; > > > > and we'll have a two byte hole here. > > > > > u32 width; > > u8 cpp; > > > > And there's a 3 byte hole already here after the cpp. > > Should group the u8 with the bools to avoid so many holes. > > > > We could also shrink y_min_scanlines to a u8 and > > stick it into the last 1 byte hole. That'd shrink the whole > > struct by 4 bytes. > > > > dbuf_block_size would also fit in a u16, but doesn't look > > like we have any other holes where we could stick it. Hmm, > > actually 'width' could probably also be shrunk to be a u16. > > So could get rid of another 4 bytes here if we really > > wanted to. > > > > But I suppose all that repacking should be a separate patch... > > Okay, I'll move the two remaining bools below the u8, so we fill part > of that existing space. And I'll add a separate patch to reduce the > y_min_scanlines to u8 and fill the existing hole. > > I'll leave the dbuf_block_size and width change out for now. I think 4 > bytes extra saving will not be worth the trouble, but I'll keep this in > mind if I eventually encounter other changes to be made with these > elements. > > > > > @@ -618,7 +625,8 @@ static unsigned int skl_wm_latency(struct > > > intel_display *display, int level, > > > display->platform.cometlake) && skl_watermark_ipc_enabled(display)) > > > latency += 4; > > > > > > - if (skl_needs_memory_bw_wa(display) && wp && wp->x_tiled) > > > + if (skl_needs_memory_bw_wa(display) && > > > + wp && wp->tiling == WM_TILING_X_TILED) > > > latency += 15; > > > > > > return latency; > > > @@ -1674,9 +1682,14 @@ skl_compute_wm_params(const struct > > > intel_crtc_state *crtc_state, > > > return -EINVAL; > > > } > > > > > > - wp->x_tiled = modifier == I915_FORMAT_MOD_X_TILED; > > > - wp->y_tiled = modifier != I915_FORMAT_MOD_X_TILED && > > > - intel_fb_is_tiled_modifier(modifier); > > > + if (modifier == I915_FORMAT_MOD_X_TILED) > > > + wp->tiling = WM_TILING_X_TILED; > > > + else if (modifier != I915_FORMAT_MOD_X_TILED && > > > > The modifier check here is redundant with the if-else construct. > > Indeed. > > > > > + intel_fb_is_tiled_modifier(modifier)) > > > + wp->tiling = WM_TILING_Y_FAMILY; > > > + else > > > + wp->tiling = WM_TILING_LINEAR; > > > > In fact we can avoid the entire intel_fb_is_tiled_modifier() > > call with something like: > > > > if (mod == LINEAR) > > tiling = LINEAR; > > else if (mod == X) > > tiling = X; > > else > > tiling = Y_Yf_4; > > Nice, I'll change it. > > > > The wm code always pops up fairly high in cpu profiles, so > > anything that makes it lighter is worth considering. > > Oh, that's good to know. But this is not happening in the "datapath", > but during modesets, right? Do the modesets really take so long and > happen so often so that avoiding a CPU cycles makes much difference?
This happens on every flip. While not a big deal perhaps for sync flips @60Hz, but with async flips and/or high refresh rates it starts to matter more. -- Ville Syrjälä Intel
