Chris Wilson <ch...@chris-wilson.co.uk> writes:

> Continuing the fun of trying to find exactly the delay that is
> sufficient to ensure that the page directory is fully loaded between
> context switches, move the extra flush added in commit 70b73f9ac113
> ("drm/i915/ringbuffer: Delay after invalidating gen6+ xcs") to just
> after we flush the pd. This is based entirely on empirical data from
> running the failing tests in a loop until we survive a day (before this
> patch, the mtbf is 10-30 minutes).
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107769
> References: 70b73f9ac113 ("drm/i915/ringbuffer: Delay after invalidating gen6+ xcs")
> Signed-off-by: Chris Wilson <ch...@chris-wilson.co.uk>
> ---
>  drivers/gpu/drm/i915/intel_ringbuffer.c | 40 ++++++++++++++-----------
>  1 file changed, 22 insertions(+), 18 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 86604dd1c5a5..472939f5c18f 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -1707,9 +1707,29 @@ static int switch_context(struct i915_request *rq)
>       }
>  
>       if (ppgtt) {
> +             ret = engine->emit_flush(rq, EMIT_INVALIDATE);
> +             if (ret)
> +                     goto err_mm;
> +
>               ret = flush_pd_dir(rq);
>               if (ret)
>                       goto err_mm;
> +
> +             /*
> +              * Not only do we need a full barrier (post-sync write) after
> +              * invalidating the TLBs, but we need to wait a little bit
> +              * longer. Whether this is merely delaying us, or the
> +              * subsequent flush is a key part of serialising with the
> +              * post-sync op, this extra pass appears vital before a
> +              * mm switch!
> +              */
> +             ret = engine->emit_flush(rq, EMIT_INVALIDATE);
> +             if (ret)
> +                     goto err_mm;
> +
> +             ret = engine->emit_flush(rq, EMIT_FLUSH);
> +             if (ret)
> +                     goto err_mm;

Someone said that the proof is in the pudding. It would just be more
fun if someone showed us the recipe.

Acked-by: Mika Kuoppala <mika.kuopp...@linux.intel.com>
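
For reference, a minimal sketch of the resulting ordering in
switch_context() after this patch, condensed from the hunk above (error
handling elided; each ret is checked in the real code):

	if (ppgtt) {
		/* First pass: invalidate the TLBs. */
		ret = engine->emit_flush(rq, EMIT_INVALIDATE);

		/* Reload the page directory registers. */
		ret = flush_pd_dir(rq);

		/*
		 * Empirically required: a second invalidate plus a flush
		 * (post-sync write) before the mm switch takes hold.
		 */
		ret = engine->emit_flush(rq, EMIT_INVALIDATE);
		ret = engine->emit_flush(rq, EMIT_FLUSH);
	}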


>       }
>  
>       if (ctx->remap_slice) {
> @@ -1947,7 +1967,7 @@ static void gen6_bsd_submit_request(struct i915_request *request)
>       intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
>  }
>  
> -static int emit_mi_flush_dw(struct i915_request *rq, u32 flags)
> +static int mi_flush_dw(struct i915_request *rq, u32 flags)
>  {
>       u32 cmd, *cs;
>  
> @@ -1985,23 +2005,7 @@ static int emit_mi_flush_dw(struct i915_request *rq, u32 flags)
>  
>  static int gen6_flush_dw(struct i915_request *rq, u32 mode, u32 invflags)
>  {
> -     int err;
> -
> -     /*
> -      * Not only do we need a full barrier (post-sync write) after
> -      * invalidating the TLBs, but we need to wait a little bit
> -      * longer. Whether this is merely delaying us, or the
> -      * subsequent flush is a key part of serialising with the
> -      * post-sync op, this extra pass appears vital before a
> -      * mm switch!
> -      */
> -     if (mode & EMIT_INVALIDATE) {
> -             err = emit_mi_flush_dw(rq, invflags);
> -             if (err)
> -                     return err;
> -     }
> -
> -     return emit_mi_flush_dw(rq, 0);
> +     return mi_flush_dw(rq, mode & EMIT_INVALIDATE ? invflags : 0);
>  }
>  
>  static int gen6_bsd_ring_flush(struct i915_request *rq, u32 mode)
> -- 
> 2.19.0.rc1
>
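
As a usage note, with the double-pass folded into switch_context(), a
ring's flush vfunc reduces to a single MI_FLUSH_DW emission. A hedged
sketch of the simplified path (the BSD invalidate flags below are my
assumption, not taken from this patch):

	static int gen6_flush_dw(struct i915_request *rq, u32 mode, u32 invflags)
	{
		/* One MI_FLUSH_DW; invalidate bits only on EMIT_INVALIDATE. */
		return mi_flush_dw(rq, mode & EMIT_INVALIDATE ? invflags : 0);
	}

	static int gen6_bsd_ring_flush(struct i915_request *rq, u32 mode)
	{
		/*
		 * Assumed flag set for illustration; MI_INVALIDATE_TLB |
		 * MI_INVALIDATE_BSD matches the gen6+ BSD ring convention.
		 */
		return gen6_flush_dw(rq, mode,
				     MI_INVALIDATE_TLB | MI_INVALIDATE_BSD);
	}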