Re: [Intel-gfx] [PATCH 06/22] drm/i915/execlists: Do not propagate errors to dependent fences

2021-08-17 Thread Daniel Vetter
On Tue, Aug 17, 2021 at 5:13 PM Matthew Brost  wrote:
> On Tue, Aug 17, 2021 at 11:21:27AM +0200, Daniel Vetter wrote:
> > On Mon, Aug 16, 2021 at 06:51:23AM -0700, Matthew Brost wrote:
> > > Progagating errors to dependent fences is wrong, don't do it. Selftest
> > > in following patch exposes this bug.
> >
> > Please explain what "this bug" is, it's hard to read minds, especially at
> > a distance in spacetime :-)
> >
>
> Not a very good explaination.
>
> > > Fixes: 8e9f84cf5cac ("drm/i915/gt: Propagate change in error status to 
> > > children on unhold")
> >
> > I think it would be better to outright revert this, instead of just
> > disabling it like this.
> >
>
> I tried revert and git did some really odd things that I couldn't
> resolve, hence the new patch.

If there's any conflict git just gives you your current code, and what
was there with the revert applied, with the block markers. Then it's
your job to manually apply that change.

Occasionally (when there's been ridiculous amounts of code movement)
it gets completely lost and puts these into very non-intuitive places.
In that case just delete it, keep the current code, and check what
change you're missing that needs to be manually reverted still. Also
sometimes there's a follow-up patch that you should revert first,
which makes the revert clean. In that case it's generally the right
thing to revert the follow-up first, and then apply your revert. Often
there's subtle functional dependencies hiding.
-Daniel

>
> > Also please cite the dma_fence error propagation revert from Jason:
> >
> > commit 93a2711cddd5760e2f0f901817d71c93183c3b87
> > Author: Jason Ekstrand 
> > Date:   Wed Jul 14 14:34:16 2021 -0500
> >
> > Revert "drm/i915: Propagate errors on awaiting already signaled fences"
> >
> > Maybe in full, if you need the justification.
> >
>
> Will site.
>
> > > Signed-off-by: Matthew Brost 
> > > Cc: 
> >
> > Unless "this bug" is some real world impact thing I wouldn't put cc:
> > stable on this.
>
> Got it.
>
> Matt
>
> > -Daniel
> > > ---
> > >  drivers/gpu/drm/i915/gt/intel_execlists_submission.c | 4 
> > >  1 file changed, 4 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c 
> > > b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> > > index de5f9c86b9a4..cafb0608ffb4 100644
> > > --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> > > +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> > > @@ -2140,10 +2140,6 @@ static void __execlists_unhold(struct i915_request 
> > > *rq)
> > > if (p->flags & I915_DEPENDENCY_WEAK)
> > > continue;
> > >
> > > -   /* Propagate any change in error status */
> > > -   if (rq->fence.error)
> > > -   i915_request_set_error_once(w, 
> > > rq->fence.error);
> > > -
> > > if (w->engine != rq->engine)
> > > continue;
> > >
> > > --
> > > 2.32.0
> > >
> >
> > --
> > Daniel Vetter
> > Software Engineer, Intel Corporation
> > http://blog.ffwll.ch



-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [Intel-gfx] [PATCH 06/22] drm/i915/execlists: Do not propagate errors to dependent fences

2021-08-17 Thread Matthew Brost
On Tue, Aug 17, 2021 at 11:21:27AM +0200, Daniel Vetter wrote:
> On Mon, Aug 16, 2021 at 06:51:23AM -0700, Matthew Brost wrote:
> > Progagating errors to dependent fences is wrong, don't do it. Selftest
> > in following patch exposes this bug.
> 
> Please explain what "this bug" is, it's hard to read minds, especially at
> a distance in spacetime :-)
> 

Not a very good explaination.

> > Fixes: 8e9f84cf5cac ("drm/i915/gt: Propagate change in error status to 
> > children on unhold")
> 
> I think it would be better to outright revert this, instead of just
> disabling it like this.
>

I tried revert and git did some really odd things that I couldn't
resolve, hence the new patch.
 
> Also please cite the dma_fence error propagation revert from Jason:
> 
> commit 93a2711cddd5760e2f0f901817d71c93183c3b87
> Author: Jason Ekstrand 
> Date:   Wed Jul 14 14:34:16 2021 -0500
> 
> Revert "drm/i915: Propagate errors on awaiting already signaled fences"
> 
> Maybe in full, if you need the justification.
>

Will site.

> > Signed-off-by: Matthew Brost 
> > Cc: 
> 
> Unless "this bug" is some real world impact thing I wouldn't put cc:
> stable on this.

Got it.

Matt

> -Daniel
> > ---
> >  drivers/gpu/drm/i915/gt/intel_execlists_submission.c | 4 
> >  1 file changed, 4 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c 
> > b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> > index de5f9c86b9a4..cafb0608ffb4 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> > @@ -2140,10 +2140,6 @@ static void __execlists_unhold(struct i915_request 
> > *rq)
> > if (p->flags & I915_DEPENDENCY_WEAK)
> > continue;
> >  
> > -   /* Propagate any change in error status */
> > -   if (rq->fence.error)
> > -   i915_request_set_error_once(w, rq->fence.error);
> > -
> > if (w->engine != rq->engine)
> > continue;
> >  
> > -- 
> > 2.32.0
> > 
> 
> -- 
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch


Re: [Intel-gfx] [PATCH 06/22] drm/i915/execlists: Do not propagate errors to dependent fences

2021-08-17 Thread Daniel Vetter
On Mon, Aug 16, 2021 at 06:51:23AM -0700, Matthew Brost wrote:
> Progagating errors to dependent fences is wrong, don't do it. Selftest
> in following patch exposes this bug.

Please explain what "this bug" is, it's hard to read minds, especially at
a distance in spacetime :-)

> Fixes: 8e9f84cf5cac ("drm/i915/gt: Propagate change in error status to 
> children on unhold")

I think it would be better to outright revert this, instead of just
disabling it like this.

Also please cite the dma_fence error propagation revert from Jason:

commit 93a2711cddd5760e2f0f901817d71c93183c3b87
Author: Jason Ekstrand 
Date:   Wed Jul 14 14:34:16 2021 -0500

Revert "drm/i915: Propagate errors on awaiting already signaled fences"

Maybe in full, if you need the justification.

> Signed-off-by: Matthew Brost 
> Cc: 

Unless "this bug" is some real world impact thing I wouldn't put cc:
stable on this.
-Daniel
> ---
>  drivers/gpu/drm/i915/gt/intel_execlists_submission.c | 4 
>  1 file changed, 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c 
> b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> index de5f9c86b9a4..cafb0608ffb4 100644
> --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> @@ -2140,10 +2140,6 @@ static void __execlists_unhold(struct i915_request *rq)
>   if (p->flags & I915_DEPENDENCY_WEAK)
>   continue;
>  
> - /* Propagate any change in error status */
> - if (rq->fence.error)
> - i915_request_set_error_once(w, rq->fence.error);
> -
>   if (w->engine != rq->engine)
>   continue;
>  
> -- 
> 2.32.0
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


[Intel-gfx] [PATCH 06/22] drm/i915/execlists: Do not propagate errors to dependent fences

2021-08-16 Thread Matthew Brost
Progagating errors to dependent fences is wrong, don't do it. Selftest
in following patch exposes this bug.

Fixes: 8e9f84cf5cac ("drm/i915/gt: Propagate change in error status to children 
on unhold")
Signed-off-by: Matthew Brost 
Cc: 
---
 drivers/gpu/drm/i915/gt/intel_execlists_submission.c | 4 
 1 file changed, 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c 
b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
index de5f9c86b9a4..cafb0608ffb4 100644
--- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
+++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
@@ -2140,10 +2140,6 @@ static void __execlists_unhold(struct i915_request *rq)
if (p->flags & I915_DEPENDENCY_WEAK)
continue;
 
-   /* Propagate any change in error status */
-   if (rq->fence.error)
-   i915_request_set_error_once(w, rq->fence.error);
-
if (w->engine != rq->engine)
continue;
 
-- 
2.32.0