i915: Transform context WAs into static tables

Oscar Mateo Mon, 06 Nov 2017 10:54:03 -0800


On 11/06/2017 03:59 AM, Joonas Lahtinen wrote:

On Fri, 2017-11-03 at 11:09 -0700, Oscar Mateo wrote:

This is for WAs that need to touch registers that get saved/restored
together with the logical context. The idea is that WAs are "pretty"
static, so a table is more declarative than a programmatic approach.
Note however that some amount of caching is needed for those things
that are dynamic (e.g. things that need some calculation, or have
criteria different from the more obvious GEN + stepping).

Also, this makes very explicit which WAs live in the context.

Suggested-by: Joonas Lahtinen <joonas.lahti...@linux.intel.com>
Signed-off-by: Oscar Mateo <oscar.ma...@intel.com>
Cc: Chris Wilson <ch...@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuopp...@linux.intel.com>

<SNIP>

+struct i915_wa_reg;
+
+typedef bool (* wa_pre_hook_func)(struct drm_i915_private *dev_priv,
+                                 struct i915_wa_reg *wa);
+typedef void (* wa_post_hook_func)(struct drm_i915_private *dev_priv,
+                                  struct i915_wa_reg *wa);

To avoid carrying any variables over, how about just apply() hook?
Also, you don't have to have "_hook" going there, it's tak

Not all WAs are applied in the same way: ctx-style workarounds are
emitted as LRI commands to the ring. Do you treat those differently?

  struct i915_wa_reg {
+       const char *name;

We may want some Kconfig option for skipping these.

Sure. But we should try to decide first if we want to store this at all,
like: what do we expect to use this for? is it worth it?

+       enum wa_type {
+               I915_WA_TYPE_CONTEXT = 0,
+               I915_WA_TYPE_GT,
+               I915_WA_TYPE_DISPLAY,
+               I915_WA_TYPE_WHITELIST
+       } type;
+

Any specific reason not to have the gen here too? Then you can have one
big table, instead of tables of tables. Then the numeric code of a WA
(position in that table) would be equally identifying it compared to
the WA name (which is nice to have information, so config time opt-in).

Such a "big table" would be quite big, indeed. And we know we want to
apply the workarounds from at least four different places, so looping
through the table each and every time to find the relevant WAs seems
like a waste. Also, in some places we would have to loop more than once
(to know the number of WAs to apply before we can reserve space in the
ring for ctx-style WAs, for example).

I could also go for 4 slightly smaller tables (one per type of WA) but
then there is another problem to solve: how do you record WAs that apply
for all revisions of one GEN, but a smaller number of revisions of
another? (e.g. WaDisableFenceDestinationToSLM applies to all BDW
steppings but only KBL A0).
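
For illustration only, here is a minimal sketch of one way the "all
revisions of one GEN, one revision of another" case could be recorded:
the same WA name simply appears once per platform, each with its own
since/until range. Everything here (the platform enum, REV_A0, REV_LAST,
struct wa_entry and both helpers) is hypothetical and not taken from the
patch. The counting helper also shows the extra pass over the table that
is needed to learn how many WAs apply before space can be reserved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical platform and revision identifiers, for illustration only. */
enum platform { PLAT_BDW, PLAT_KBL };
#define REV_A0   0x00
#define REV_LAST 0xff /* stand-in for "every later stepping" */

struct wa_entry {
	const char *name;
	enum platform platform;
	uint8_t since; /* first revision the WA applies to, inclusive */
	uint8_t until; /* last revision the WA applies to, inclusive */
};

/* The same WA is listed twice, once per platform, with different ranges. */
static const struct wa_entry wa_table[] = {
	{ "WaDisableFenceDestinationToSLM", PLAT_BDW, REV_A0, REV_LAST },
	{ "WaDisableFenceDestinationToSLM", PLAT_KBL, REV_A0, REV_A0 },
};

/* Coarse platform check first, then the stepping range. */
static bool wa_applies(const struct wa_entry *wa, enum platform p, uint8_t rev)
{
	assert(wa != NULL);
	return wa->platform == p && rev >= wa->since && rev <= wa->until;
}

/* First pass over the table: how many WAs match this platform/revision? */
static size_t wa_count_for(enum platform p, uint8_t rev)
{
	size_t i, n = 0;

	for (i = 0; i < sizeof(wa_table) / sizeof(wa_table[0]); i++)
		if (wa_applies(&wa_table[i], p, rev))
			n++;
	return n;
}
```

With this layout, wa_count_for(PLAT_KBL, 0x01) finds nothing while any
BDW revision matches one entry. The cost is that the name alone no
longer identifies a row.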

+       u8 since;
+       u8 until;

Most seem to have ALL_REVS, so this could be after the coarse-grained
gen-check in the apply function.

So every single WA that applies to specific REVS gets an "apply"
function? That looks like a lot of functions (I count 25 WAs that only
apply to some steppings already). Or are you simply saying here that I
check the GEN before checking the stepping (which is the only order that
makes sense anyway)?

+
        i915_reg_t addr;
-       u32 value;
-       /* bitmask representing WA bits */
        u32 mask;
+       u32 value;
+       bool is_masked_reg;

I'd hide this detail into the apply function.


I see. But if you don't store the mask: what do you output in debugfs?

+
+       wa_pre_hook_func pre_hook;
+       wa_post_hook_func post_hook;

        bool (*apply)(const struct i915_wa *wa,
                      struct drm_i915_private *dev_priv);

+       u32 hook_data;
+       bool applied;

The big point would be to make this into const, so "applied" would
defeat that.

Yeah, I realized. Keeping a separate bitmask of which WAs have been
applied is not a big deal, but then I became aware that there are many
more things that would need to be cached. For example, some WAs require
you to compute the actual value you write into their register. What do
you do with those? (remember that you still want to print the expected
value in debugfs for these).
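
As a rough sketch of the "separate bitmask" idea, assuming a table small
enough to fit in one 32-bit word: the descriptors stay const and the
mutable "applied" state lives in its own structure, indexed by table
position. The names wa_desc, wa_state, WA_COUNT and the two helpers are
made up for this example. It only covers the applied flag, not the
computed values, which remain an open question.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define WA_COUNT 2 /* hypothetical table size, must stay <= 32 here */

/* Read-only description of a workaround. */
struct wa_desc {
	const char *name;
};

static const struct wa_desc wa_table[WA_COUNT] = {
	{ "WaDisableAsyncFlipPerfMode" },
	{ "WaDisableFenceDestinationToSLM" },
};

/* Mutable per-device state, kept apart so the table can be const. */
struct wa_state {
	uint32_t applied; /* bit N set means wa_table[N] has been applied */
};

static void wa_mark_applied(struct wa_state *st, size_t idx)
{
	assert(st != NULL && idx < WA_COUNT);
	st->applied |= UINT32_C(1) << idx;
}

static bool wa_was_applied(const struct wa_state *st, size_t idx)
{
	assert(st != NULL && idx < WA_COUNT);
	return (st->applied >> idx) & 1u;
}
```

In kernel code a larger table would more likely use DECLARE_BITMAP()
with set_bit()/test_bit() rather than a bare integer.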

<SNIP>

+#define MASK(mask, value)      ((mask) << 16 | (value))
+#define MASK_ENABLE(x)         (MASK((x), (x)))
+#define MASK_DISABLE(x)                (MASK((x), 0))

-#define WA_REG(addr, mask, val) do { \

-               const int r = wa_add(dev_priv, (addr), (mask), (val)); \
-               if (r) \
-                       return r; \
-       } while (0)
+#define SET_BIT_MASKED(m)              \
+       .mask = (m),                    \
+       .value = MASK_ENABLE(m),        \
+       .is_masked_reg = true

-#define WA_SET_BIT_MASKED(addr, mask) \

-       WA_REG(addr, (mask), _MASKED_BIT_ENABLE(mask))
+#define CLEAR_BIT_MASKED(m)            \
+       .mask = (m),                    \
+       .value = MASK_DISABLE(m),       \
+       .is_masked_reg = true

-#define WA_CLR_BIT_MASKED(addr, mask) \

-       WA_REG(addr, (mask), _MASKED_BIT_DISABLE(mask))
+#define SET_FIELD_MASKED(m, v)                 \
+       .mask = (m),                    \
+       .value = MASK(m, v),            \
+       .is_masked_reg = true

Lets try to have the struct i915_wa as small as possible, so this could
be calculated in the apply function.

So, avoiding the macros this would indeed become rather declarative;

{
        WA_NAME("WaDisableAsyncFlipPerfMode")
        .gen = ...,
        .reg = MI_MODE,
        .value = ASYNC_FLIP_PERF_DISABLE,
        .apply = set_bit_masked,
},
Or, we could also have;

static const struct i915_wa WaDisableAsyncFlipPerfMode = {
        .gen = ...,
        .reg = MI_MODE,
        .value = ASYNC_FLIP_PERF_DISABLE,
        .apply = set_bit_masked,
};

And then one array of those.

        WA(WaDisableAsyncFlipPerfMode),
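
A self-contained sketch of how such a declarative entry could work, with
the masked-register encoding computed inside the apply function rather
than stored. This is not driver code: struct fake_dev, the FAKE_*
names and fake_write_masked() are stand-ins invented here to emulate a
masked register (upper 16 bits select which of the lower 16 bits get
written), and the bit value is a placeholder.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the device: a tiny fake register file. */
enum fake_reg { FAKE_MI_MODE, FAKE_REG_COUNT };
struct fake_dev { uint32_t regs[FAKE_REG_COUNT]; };

/* Placeholder bit, not necessarily the real hardware value. */
#define FAKE_ASYNC_FLIP_PERF_DISABLE (UINT32_C(1) << 14)

/* Emulate a masked write: only bits enabled in the upper half change. */
static void fake_write_masked(struct fake_dev *dev, enum fake_reg reg,
			      uint32_t val)
{
	uint32_t mask = val >> 16;

	assert(dev != NULL && reg < FAKE_REG_COUNT);
	dev->regs[reg] = (dev->regs[reg] & ~mask) | (val & mask);
}

struct wa {
	const char *name;
	enum fake_reg reg;
	uint32_t value;
	bool (*apply)(const struct wa *wa, struct fake_dev *dev);
};

/* The (mask << 16 | value) encoding is derived here, not stored. */
static bool set_bit_masked(const struct wa *wa, struct fake_dev *dev)
{
	fake_write_masked(dev, wa->reg, (wa->value << 16) | wa->value);
	return true;
}

static bool clear_bit_masked(const struct wa *wa, struct fake_dev *dev)
{
	fake_write_masked(dev, wa->reg, wa->value << 16);
	return true;
}

static const struct wa wa_table[] = {
	{
		.name = "WaDisableAsyncFlipPerfMode",
		.reg = FAKE_MI_MODE,
		.value = FAKE_ASYNC_FLIP_PERF_DISABLE,
		.apply = set_bit_masked,
	},
};

/* Walk the table and return how many entries reported success. */
static size_t wa_apply_all(struct fake_dev *dev)
{
	size_t i, n = 0;

	for (i = 0; i < sizeof(wa_table) / sizeof(wa_table[0]); i++)
		if (wa_table[i].apply(&wa_table[i], dev))
			n++;
	return n;
}
```

The table itself stays const and small, but nothing in it records the
mask or the final encoded value, so anything that wants to print them
has to recompute them the same way the apply function does.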

This is the list of problems we need to solve before we can go forward
with this design:

- What to do with WAs that don't know a priori what .value should be,
  because it gets computed in places like skl_tune_iz_hashing or
  use_gtt_cache? (yes, computing in the apply function is the immediate
  answer, but then... how do you output that in debugfs?).
- What to do with context-style WAs, that are emitted instead of
  applied, as I mentioned above?
- What to do with whitelist-style functions, where you need to access
  the .reg field of i915_reg_t to know the .value? Also, the .reg depends
  on the engine (although I guess you can always statically codify that
  in the table and apply the whitelist WAs later, once all the engines
  are up).
- You are not storing .since/.until. Does that mean every WA that
  applies to only some steppings gets a custom apply function?
- If you don't store the computed mask anywhere, what do you output in
  debugfs? (which is the real improvement we want to achieve?).
- Something to be careful about: some WAs are named the same, but their
  reg/value is different (because the register has changed in one
  particular GEN or whatever). The solution could be a modifier to the
  name (WaSomething_bdw_chv and WaSomething_skl) but this could be a
  source of errors.

Then you could at compile time decide if you stringify and store the
name. But that'd be more const data than necessary (pointers to
structs, instead of an array of structs).

Regards, Joonas

One more thing: I still urge you to reconsider merging what we already
have, and doing these improvements (once we agree on a design) later on.
The reason being that the sooner we get a list of all WAs in debugfs,
the better (which can be used later on to verify any further
improvements we do).


Thanks for the review,
Oscar

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [RFC PATCH 04/20] drm/i915: Transform context WAs into static tables
