Re: [Intel-gfx] [RFC PATCH 04/20] drm/i915: Transform context WAs into static tables
On 11/06/2017 03:59 AM, Joonas Lahtinen wrote: On Fri, 2017-11-03 at 11:09 -0700, Oscar Mateo wrote: This is for WAs that need to touch registers that get saved/restored together with the logical context. The idea is that WAs are "pretty" static, so a table is more declarative than a programmatic approah. Note however that some amount is caching is needed for those things that are dynamic (e.g. things that need some calculation, or have a criteria different than the more obvious GEN + stepping). Also, this makes very explicit which WAs live in the context. Suggested-by: Joonas Lahtinen Signed-off-by: Oscar Mateo Cc: Chris Wilson Cc: Mika Kuoppala +struct i915_wa_reg; + +typedef bool (* wa_pre_hook_func)(struct drm_i915_private *dev_priv, + struct i915_wa_reg *wa); +typedef void (* wa_post_hook_func)(struct drm_i915_private *dev_priv, + struct i915_wa_reg *wa); To avoid carrying any variables over, how about just apply() hook? Also, you don't have to have "_hook" going there, it's tak Not all WAs are applied in the same way: ctx-style workarounds are emitted as LRI commands to the ring. Do you treat those differently? struct i915_wa_reg { + const char *name; We may want some Kconfig option for skipping these. Sure. But we should try to decide first if we want to store this at all, like: what do we expect to use this for? is it worth it? + enum wa_type { + I915_WA_TYPE_CONTEXT = 0, + I915_WA_TYPE_GT, + I915_WA_TYPE_DISPLAY, + I915_WA_TYPE_WHITELIST + } type; + Any specific reason not to have the gen here too? Then you can have one big table, instead of tables of tables. Then the numeric code of a WA (position in that table) would be equally identifying it compared to the WA name (which is nice to have information, so config time opt-in). Such a "big table" would be quite big, indeed. And we know we want to apply the workarounds from at least four different places, so looping through the table each and every time to find the relevant WAs seems like a waste. Also, in some places we would have to loop more than once ( to know the number of WAs to apply before we can reserve space in the ring for ctx-style WAs, for example). I could also go for 4 slightly smaller tables (one per type of WA) but then there is another problem to solve: how do you record WAs that apply for all revisions of one GEN, but a smaller number of revisions of another? (e.g. WaDisableFenceDestinationToSLM applies to all BDW steppings but only KBL A0). + u8 since; + u8 until; Most seem to have ALL_REVS, so this could be after the coarse-grained gen-check in the apply function. So every single WA that applies to specific REVS gets an "apply" function? That looks like a lot of functions (I count 25 WAs that only apply to some steppings already). Or are you simply saying here that I check the GEN before checking the stepping (which is the only order that makes sense anyway)? + i915_reg_t addr; - u32 value; - /* bitmask representing WA bits */ u32 mask; + u32 value; + bool is_masked_reg; I'd hide this detail into the apply function. I see. But if you don't store the mask: what do you output in debugfs? + + wa_pre_hook_func pre_hook; + wa_post_hook_func post_hook; bool (*apply)(const struct i915_wa *wa, struct drm_i915_private *dev_priv); + u32 hook_data; + bool applied; The big point would be to make this into const, so "applied" would defeat that. Yeah, I realized. Keeping a separate bitmask of which WAs have been applied is not a big deal, but then I became aware that there are many more things that would need to be cached. For example, some WAs require to compute the actual value you write into their register. What do you do with those? (remember that you still want to print the expected value in debugfs for these). +#define MASK(mask, value) ((mask) << 16 | (value)) +#define MASK_ENABLE(x) (MASK((x), (x))) +#define MASK_DISABLE(x)(MASK((x), 0)) -#define WA_REG(addr, mask, val) do { \ - const int r = wa_add(dev_priv, (addr), (mask), (val)); \ - if (r) \ - return r; \ - } while (0) +#define SET_BIT_MASKED(m) \ + .mask = (m),\ + .value = MASK_ENABLE(m),\ + .is_masked_reg = true -#define WA_SET_BIT_MASKED(addr, mask) \ - WA_REG(addr, (mask), _MASKED_BIT_ENABLE(mask)) +#define CLEAR_BIT_MASKED( m) \ + .mask = (m),\ + .value = MASK_DISABLE(m), \ + .is_masked_reg = true -#define WA_CLR_BIT_MASKED(addr, mask) \ - WA_REG(addr, (mask), _MASKED_BIT_DISABLE(mask)) +#define SET_FIELD_MASKED(m, v) \
Re: [Intel-gfx] [RFC PATCH 04/20] drm/i915: Transform context WAs into static tables
On Fri, 2017-11-03 at 11:09 -0700, Oscar Mateo wrote: > This is for WAs that need to touch registers that get saved/restored > together with the logical context. The idea is that WAs are "pretty" > static, so a table is more declarative than a programmatic approah. > Note however that some amount is caching is needed for those things > that are dynamic (e.g. things that need some calculation, or have > a criteria different than the more obvious GEN + stepping). > > Also, this makes very explicit which WAs live in the context. > > Suggested-by: Joonas Lahtinen > Signed-off-by: Oscar Mateo > Cc: Chris Wilson > Cc: Mika Kuoppala > +struct i915_wa_reg; > + > +typedef bool (* wa_pre_hook_func)(struct drm_i915_private *dev_priv, > + struct i915_wa_reg *wa); > +typedef void (* wa_post_hook_func)(struct drm_i915_private *dev_priv, > +struct i915_wa_reg *wa); To avoid carrying any variables over, how about just apply() hook? Also, you don't have to have "_hook" going there, it's tak > struct i915_wa_reg { > + const char *name; We may want some Kconfig option for skipping these. > + enum wa_type { > + I915_WA_TYPE_CONTEXT = 0, > + I915_WA_TYPE_GT, > + I915_WA_TYPE_DISPLAY, > + I915_WA_TYPE_WHITELIST > + } type; > + Any specific reason not to have the gen here too? Then you can have one big table, instead of tables of tables. Then the numeric code of a WA (position in that table) would be equally identifying it compared to the WA name (which is nice to have information, so config time opt-in). > + u8 since; > + u8 until; Most seem to have ALL_REVS, so this could be after the coarse-grained gen-check in the apply function. > + > i915_reg_t addr; > - u32 value; > - /* bitmask representing WA bits */ > u32 mask; > + u32 value; > + bool is_masked_reg; I'd hide this detail into the apply function. > + > + wa_pre_hook_func pre_hook; > + wa_post_hook_func post_hook; bool (*apply)(const struct i915_wa *wa, struct drm_i915_private *dev_priv); > + u32 hook_data; > + bool applied; The big point would be to make this into const, so "applied" would defeat that. > +#define MASK(mask, value)((mask) << 16 | (value)) > +#define MASK_ENABLE(x) (MASK((x), (x))) > +#define MASK_DISABLE(x) (MASK((x), 0)) > > -#define WA_REG(addr, mask, val) do { \ > - const int r = wa_add(dev_priv, (addr), (mask), (val)); \ > - if (r) \ > - return r; \ > - } while (0) > +#define SET_BIT_MASKED(m)\ > + .mask = (m),\ > + .value = MASK_ENABLE(m),\ > + .is_masked_reg = true > > -#define WA_SET_BIT_MASKED(addr, mask) \ > - WA_REG(addr, (mask), _MASKED_BIT_ENABLE(mask)) > +#define CLEAR_BIT_MASKED( m) \ > + .mask = (m),\ > + .value = MASK_DISABLE(m), \ > + .is_masked_reg = true > > -#define WA_CLR_BIT_MASKED(addr, mask) \ > - WA_REG(addr, (mask), _MASKED_BIT_DISABLE(mask)) > +#define SET_FIELD_MASKED(m, v) \ > + .mask = (m),\ > + .value = MASK(m, v),\ > + .is_masked_reg = true Lets try to have the struct i915_wa as small as possible, so this could be calculated in the apply function. So, avoiding the macros this would indeed become rather declarative; { WA_NAME("WaDisableAsyncFlipPerfMode") .gen = ..., .reg = MI_MODE, .value = ASYNC_FLIP_PERF_DISABLE, .apply = set_bit_masked, }, Or, we could also have; static const struct i915_wa WaDisableAsyncFlipPerfMode = { .gen = ..., .reg = MI_MODE, .value = ASYNC_FLIP_PERF_DISABLE, .apply = set_bit_masked, }; And then one array of those. WA(WaDisableAsyncFlipPerfMode), Then you could at compile time decide if you stringify and store the name. But that'd be more const data than necessary (pointers to structs, instead of an array of structs). Regards, Joonas -- Joonas Lahtinen Open Source Technology Center Intel Corporation ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [RFC PATCH 04/20] drm/i915: Transform context WAs into static tables
This is for WAs that need to touch registers that get saved/restored together with the logical context. The idea is that WAs are "pretty" static, so a table is more declarative than a programmatic approah. Note however that some amount is caching is needed for those things that are dynamic (e.g. things that need some calculation, or have a criteria different than the more obvious GEN + stepping). Also, this makes very explicit which WAs live in the context. Suggested-by: Joonas Lahtinen Signed-off-by: Oscar Mateo Cc: Chris Wilson Cc: Mika Kuoppala --- drivers/gpu/drm/i915/i915_debugfs.c | 40 +- drivers/gpu/drm/i915/i915_drv.h | 35 +- drivers/gpu/drm/i915/i915_gem_context.c | 4 - drivers/gpu/drm/i915/intel_workarounds.c | 754 +-- drivers/gpu/drm/i915/intel_workarounds.h | 4 +- 5 files changed, 466 insertions(+), 371 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index 39883cd..12c4330 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -30,6 +30,7 @@ #include #include #include "intel_drv.h" +#include "intel_workarounds.h" #include "i915_guc_submission.h" static inline struct drm_i915_private *node_to_i915(struct drm_info_node *node) @@ -3357,13 +3358,16 @@ static int i915_shared_dplls_info(struct seq_file *m, void *unused) static int i915_wa_registers(struct seq_file *m, void *unused) { - int i; - int ret; struct intel_engine_cs *engine; struct drm_i915_private *dev_priv = node_to_i915(m->private); struct drm_device *dev = &dev_priv->drm; struct i915_workarounds *workarounds = &dev_priv->workarounds; + const struct i915_wa_reg_table *wa_table; + uint table_count; enum intel_engine_id id; + int i, j, ret; + + intel_ctx_workarounds_get(dev_priv, &wa_table, &table_count); ret = mutex_lock_interruptible(&dev->struct_mutex); if (ret) @@ -3371,22 +3375,28 @@ static int i915_wa_registers(struct seq_file *m, void *unused) intel_runtime_pm_get(dev_priv); - seq_printf(m, "Workarounds applied: %d\n", workarounds->count); + seq_printf(m, "Workarounds applied: %d\n", workarounds->ctx_count); for_each_engine(engine, dev_priv, id) seq_printf(m, "HW whitelist count for %s: %d\n", engine->name, workarounds->hw_whitelist_count[id]); - for (i = 0; i < workarounds->count; ++i) { - i915_reg_t addr; - u32 mask, value, read; - bool ok; - - addr = workarounds->reg[i].addr; - mask = workarounds->reg[i].mask; - value = workarounds->reg[i].value; - read = I915_READ(addr); - ok = (value & mask) == (read & mask); - seq_printf(m, "0x%X: 0x%08X, mask: 0x%08X, read: 0x%08x, status: %s\n", - i915_mmio_reg_offset(addr), value, mask, read, ok ? "OK" : "FAIL"); + + for (i = 0; i < table_count; i++) { + const struct i915_wa_reg *wa = wa_table[i].table; + + for (j = 0; j < wa_table[i].count; j++) { + u32 read; + bool ok; + + if (!wa[j].applied) + continue; + + read = I915_READ(wa[j].addr); + ok = (wa[j].value & wa[j].mask) == (read & wa[j].mask); + seq_printf(m, + "0x%X: 0x%08X, mask: 0x%08X, read: 0x%08x, status: %s, name: %s\n", + i915_mmio_reg_offset(wa[j].addr), wa[j].value, + wa[j].mask, read, ok ? "OK" : "FAIL", wa[j].name); + } } intel_runtime_pm_put(dev_priv); diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 4a7325c..1c73fec 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1970,18 +1970,43 @@ struct i915_frontbuffer_tracking { unsigned flip_bits; }; +struct i915_wa_reg; + +typedef bool (* wa_pre_hook_func)(struct drm_i915_private *dev_priv, + struct i915_wa_reg *wa); +typedef void (* wa_post_hook_func)(struct drm_i915_private *dev_priv, + struct i915_wa_reg *wa); + struct i915_wa_reg { + const char *name; + enum wa_type { + I915_WA_TYPE_CONTEXT = 0, + I915_WA_TYPE_GT, + I915_WA_TYPE_DISPLAY, + I915_WA_TYPE_WHITELIST + } type; + + u8 since; + u8 until; + i915_reg_t addr; - u32 value; - /* bitmask representing WA bits */ u32 mask; + u32 value; + bool is_masked_reg; + + wa_pre_hook_func pre_hook; + wa_post_hook_func post_hook; + u32 ho