Re: [Intel-gfx] [PATCH] drm/i915: Reset the breadcrumbs IRQ more carefully

2016-10-06 Thread Chris Wilson
On Thu, Oct 06, 2016 at 05:02:44PM +0300, Mika Kuoppala wrote:
> Chris Wilson  writes:
> 
> > On Thu, Oct 06, 2016 at 04:32:37PM +0300, Mika Kuoppala wrote:
> >> Chris Wilson  writes:
> >> 
> >> > Along with the interrupt, we want to restore the fake-irq and
> >> > wait-timeout detection. If we use the breadcrumbs interface to setup the
> >> > interrupt as it wants, the auxiliary timers will also be restored.
> >> >
> >> > Fixes: 821ed7df6e2a ("drm/i915: Update reset path to fix incomplete 
> >> > requests")
> >> > Signed-off-by: Chris Wilson 
> >> > Cc: Mika Kuoppala 
> >> > ---
> >> >  drivers/gpu/drm/i915/intel_breadcrumbs.c | 17 +
> >> >  drivers/gpu/drm/i915/intel_engine_cs.c   | 15 ---
> >> >  drivers/gpu/drm/i915/intel_lrc.c |  2 +-
> >> >  drivers/gpu/drm/i915/intel_ringbuffer.c  |  2 +-
> >> >  drivers/gpu/drm/i915/intel_ringbuffer.h  |  2 +-
> >> >  5 files changed, 20 insertions(+), 18 deletions(-)
> >> >
> >> > diff --git a/drivers/gpu/drm/i915/intel_breadcrumbs.c 
> >> > b/drivers/gpu/drm/i915/intel_breadcrumbs.c
> >> > index 9dba4971fb1e..d27da6d69735 100644
> >> > --- a/drivers/gpu/drm/i915/intel_breadcrumbs.c
> >> > +++ b/drivers/gpu/drm/i915/intel_breadcrumbs.c
> >> > @@ -584,6 +584,23 @@ int intel_engine_init_breadcrumbs(struct 
> >> > intel_engine_cs *engine)
> >> >  return 0;
> >> >  }
> >> >  
> >> > +void intel_engine_reset_breadcrumbs(struct intel_engine_cs *engine)
> >> > +{
> >> > +struct intel_breadcrumbs *b = &engine->breadcrumbs;
> >> > +
> >> 
> >> Should we kill the timer before proceeding in here?
> >
> > Which timer? In breadcrumbs.c, we are concerned with the fake_irq and
> > the wait-timeout. The wait-timeout is reset below, we should add the
> > code to cancel the fake_irq along with clearing the bit.
> 
> I was considering that irqs are enabled and we have an
> active breadcrumbs timer, triggering at the same time as
> reset happens. So we would enable the fake irq as a post reset
> race between reset/breadcrumbs hangcheck.
> 
> As in why not cancel and postpone the timer and only after
> clear the missed_irq?

So just picking up that we don't cancel the fake irq along with the
clear_bit() (currently just waiting for the wait to complete before
cancelling).

> >> Not relevant to this patch but I also noticed that the period
> >> is identical to hangcheck period. Multiple of hangcheck period
> >> would be better, as our kicking might help and we don't
> >> want to fallback to fake irqs just so easily.
> >
> > ?
> >
> > The main GPU hangcheck is kicked off by the wait timeout. Keeping the
> > two pieces independent (fake-irq, hangcheck) is quite nice, and the
> > jiffie wake up serves as a backup, and either it is required or it will
> > be disabled by the reset.
> 
> But we queue hangcheck also from retire work. So it could be that
> we fallback to fake irqs, even if next hangcheck might have
> managed to kick the wait and make forward progress?

Below the level of care. The limited kicking that hangcheck does is
immaterial to deciding whether or not we might need fake user interrupts.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] drm/i915: Reset the breadcrumbs IRQ more carefully

2016-10-06 Thread Mika Kuoppala
Chris Wilson  writes:

> On Thu, Oct 06, 2016 at 04:32:37PM +0300, Mika Kuoppala wrote:
>> Chris Wilson  writes:
>> 
>> > Along with the interrupt, we want to restore the fake-irq and
>> > wait-timeout detection. If we use the breadcrumbs interface to setup the
>> > interrupt as it wants, the auxiliary timers will also be restored.
>> >
>> > Fixes: 821ed7df6e2a ("drm/i915: Update reset path to fix incomplete 
>> > requests")
>> > Signed-off-by: Chris Wilson 
>> > Cc: Mika Kuoppala 
>> > ---
>> >  drivers/gpu/drm/i915/intel_breadcrumbs.c | 17 +
>> >  drivers/gpu/drm/i915/intel_engine_cs.c   | 15 ---
>> >  drivers/gpu/drm/i915/intel_lrc.c |  2 +-
>> >  drivers/gpu/drm/i915/intel_ringbuffer.c  |  2 +-
>> >  drivers/gpu/drm/i915/intel_ringbuffer.h  |  2 +-
>> >  5 files changed, 20 insertions(+), 18 deletions(-)
>> >
>> > diff --git a/drivers/gpu/drm/i915/intel_breadcrumbs.c 
>> > b/drivers/gpu/drm/i915/intel_breadcrumbs.c
>> > index 9dba4971fb1e..d27da6d69735 100644
>> > --- a/drivers/gpu/drm/i915/intel_breadcrumbs.c
>> > +++ b/drivers/gpu/drm/i915/intel_breadcrumbs.c
>> > @@ -584,6 +584,23 @@ int intel_engine_init_breadcrumbs(struct 
>> > intel_engine_cs *engine)
>> >return 0;
>> >  }
>> >  
>> > +void intel_engine_reset_breadcrumbs(struct intel_engine_cs *engine)
>> > +{
>> > +  struct intel_breadcrumbs *b = &engine->breadcrumbs;
>> > +
>> 
>> Should we kill the timer before proceeding in here?
>
> Which timer? In breadcrumbs.c, we are concerned with the fake_irq and
> the wait-timeout. The wait-timeout is reset below, we should add the
> code to cancel the fake_irq along with clearing the bit.

I was considering that irqs are enabled and we have an
active breadcrumbs timer, triggering at the same time as
reset happens. So we would enable the fake irq as a post reset
race between reset/breadcrumbs hangcheck.

As in why not cancel and postpone the timer and only after
clear the missed_irq?
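
To make the ordering being asked about concrete, here is a minimal userspace sketch. It is not driver code: struct model_breadcrumbs, MODEL_WAIT_TICKS and model_reset_breadcrumbs() are hypothetical names invented for illustration, and plain booleans stand in for the kernel timers and the engine interrupt. It only shows the sequence "stop the backup timer, postpone the wait-timeout, then clear missed_irq"; it does not model the locking a real implementation would need.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names exist in i915. */
struct model_breadcrumbs {
	bool fake_irq_armed;            /* stands in for the fake_irq timer */
	bool irq_enabled;               /* stands in for the engine user interrupt */
	bool has_waiter;                /* is anyone waiting on this engine? */
	unsigned long timeout;          /* next wait-timeout expiry, in model ticks */
	unsigned long missed_irq_rings; /* one bit per engine */
};

enum { MODEL_WAIT_TICKS = 100 };    /* assumed period, for illustration only */

static void model_reset_breadcrumbs(struct model_breadcrumbs *b,
				    unsigned int engine_id,
				    unsigned long now)
{
	/* 1. Stop the backup timer so it cannot fire during the reset. */
	b->fake_irq_armed = false;

	/* 2. Postpone the wait-timeout so it cannot expire right after. */
	b->timeout = now + MODEL_WAIT_TICKS;

	/* 3. Only now forget that this engine missed an interrupt. */
	b->missed_irq_rings &= ~(1UL << engine_id);

	/* 4. Re-enable the real interrupt only if somebody is waiting. */
	b->irq_enabled = b->has_waiter;
}
```

With the timer stopped before the bit is cleared, a late expiry has nothing left to re-arm in this model; whether the same holds in the driver depends on how the timers are actually cancelled there.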

>  
>> Not relevant to this patch but I also noticed that the period
>> is identical to hangcheck period. Multiple of hangcheck period
>> would be better, as our kicking might help and we don't
>> want to fallback to fake irqs just so easily.
>
> ?
>
> The main GPU hangcheck is kicked off by the wait timeout. Keeping the
> two pieces independent (fake-irq, hangcheck) is quite nice, and the
> jiffie wake up serves as a backup, and either it is required or it will
> be disabled by the reset.

But we queue hangcheck also from retire work. So it could be that
we fallback to fake irqs, even if next hangcheck might have
managed to kick the wait and make forward progress?

And perhaps we should rename the breadcrumb hangcheck
as wait_watchdog to avoid confusion between different independent
'hangchecks'

-Mika

> -Chris
>
> -- 
> Chris Wilson, Intel Open Source Technology Centre


Re: [Intel-gfx] [PATCH] drm/i915: Reset the breadcrumbs IRQ more carefully

2016-10-06 Thread Chris Wilson
On Thu, Oct 06, 2016 at 04:32:37PM +0300, Mika Kuoppala wrote:
> Chris Wilson  writes:
> 
> > Along with the interrupt, we want to restore the fake-irq and
> > wait-timeout detection. If we use the breadcrumbs interface to setup the
> > interrupt as it wants, the auxiliary timers will also be restored.
> >
> > Fixes: 821ed7df6e2a ("drm/i915: Update reset path to fix incomplete 
> > requests")
> > Signed-off-by: Chris Wilson 
> > Cc: Mika Kuoppala 
> > ---
> >  drivers/gpu/drm/i915/intel_breadcrumbs.c | 17 +
> >  drivers/gpu/drm/i915/intel_engine_cs.c   | 15 ---
> >  drivers/gpu/drm/i915/intel_lrc.c |  2 +-
> >  drivers/gpu/drm/i915/intel_ringbuffer.c  |  2 +-
> >  drivers/gpu/drm/i915/intel_ringbuffer.h  |  2 +-
> >  5 files changed, 20 insertions(+), 18 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/intel_breadcrumbs.c 
> > b/drivers/gpu/drm/i915/intel_breadcrumbs.c
> > index 9dba4971fb1e..d27da6d69735 100644
> > --- a/drivers/gpu/drm/i915/intel_breadcrumbs.c
> > +++ b/drivers/gpu/drm/i915/intel_breadcrumbs.c
> > @@ -584,6 +584,23 @@ int intel_engine_init_breadcrumbs(struct 
> > intel_engine_cs *engine)
> > return 0;
> >  }
> >  
> > +void intel_engine_reset_breadcrumbs(struct intel_engine_cs *engine)
> > +{
> > +   struct intel_breadcrumbs *b = &engine->breadcrumbs;
> > +
> 
> Should we kill the timer before proceeding in here?

Which timer? In breadcrumbs.c, we are concerned with the fake_irq and
the wait-timeout. The wait-timeout is reset below, we should add the
code to cancel the fake_irq along with clearing the bit.
 
> Not relevant to this patch but I also noticed that the period
> is identical to hangcheck period. Multiple of hangcheck period
> would be better, as our kicking might help and we don't
> want to fallback to fake irqs just so easily.

?

The main GPU hangcheck is kicked off by the wait timeout. Keeping the
two pieces independent (fake-irq, hangcheck) is quite nice, and the
jiffie wake up serves as a backup, and either it is required or it will
be disabled by the reset.
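
As an illustration of the split described above, the following is a small userspace sketch and not the driver's implementation. struct model_engine, model_wait_timeout_expired() and model_engine_reset() are hypothetical names, and the behaviour is one reading of this thread: the wait-timeout both queues the GPU hangcheck and arms the per-jiffie fake irq, and the reset only undoes the latter.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names exist in i915. */
struct model_engine {
	bool irq_enabled;      /* a waiter has the user interrupt enabled */
	bool fake_irq_armed;   /* per-jiffie backup wakeup is running */
	bool hangcheck_queued; /* main GPU hangcheck has been kicked off */
	unsigned long timeout; /* wait-timeout expiry, in model ticks */
};

/* Called whenever the model's wait-timeout timer fires. */
static void model_wait_timeout_expired(struct model_engine *e,
				       unsigned long now)
{
	if (!e->irq_enabled)
		return; /* nobody is waiting, nothing to watch */

	/* Naive comparison; kernel code would use time_before() on jiffies. */
	if (now < e->timeout)
		return; /* the deadline was pushed back, not expired yet */

	e->hangcheck_queued = true; /* hand over to the main GPU hangcheck */
	e->fake_irq_armed = true;   /* and start the backup wakeup */
}

/* The reset disables the backup; the hangcheck state is left alone. */
static void model_engine_reset(struct model_engine *e)
{
	e->fake_irq_armed = false;
}
```

In this model the two flags never depend on each other after the timeout has fired, which is the independence being argued for: the fake irq is either still needed or gets switched off by the reset, regardless of what the hangcheck does.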
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


Re: [Intel-gfx] [PATCH] drm/i915: Reset the breadcrumbs IRQ more carefully

2016-10-06 Thread Mika Kuoppala
Chris Wilson  writes:

> Along with the interrupt, we want to restore the fake-irq and
> wait-timeout detection. If we use the breadcrumbs interface to setup the
> interrupt as it wants, the auxiliary timers will also be restored.
>
> Fixes: 821ed7df6e2a ("drm/i915: Update reset path to fix incomplete requests")
> Signed-off-by: Chris Wilson 
> Cc: Mika Kuoppala 
> ---
>  drivers/gpu/drm/i915/intel_breadcrumbs.c | 17 +
>  drivers/gpu/drm/i915/intel_engine_cs.c   | 15 ---
>  drivers/gpu/drm/i915/intel_lrc.c |  2 +-
>  drivers/gpu/drm/i915/intel_ringbuffer.c  |  2 +-
>  drivers/gpu/drm/i915/intel_ringbuffer.h  |  2 +-
>  5 files changed, 20 insertions(+), 18 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_breadcrumbs.c 
> b/drivers/gpu/drm/i915/intel_breadcrumbs.c
> index 9dba4971fb1e..d27da6d69735 100644
> --- a/drivers/gpu/drm/i915/intel_breadcrumbs.c
> +++ b/drivers/gpu/drm/i915/intel_breadcrumbs.c
> @@ -584,6 +584,23 @@ int intel_engine_init_breadcrumbs(struct intel_engine_cs 
> *engine)
>   return 0;
>  }
>  
> +void intel_engine_reset_breadcrumbs(struct intel_engine_cs *engine)
> +{
> + struct intel_breadcrumbs *b = &engine->breadcrumbs;
> +

Should we kill the timer before proceeding in here?

Not relevant to this patch but I also noticed that the period
is identical to hangcheck period. Multiple of hangcheck period
would be better, as our kicking might help and we don't
want to fallback to fake irqs just so easily.

-Mika

> + clear_bit(engine->id, &engine->i915->gpu_error.missed_irq_rings);
> +
> + spin_lock(&b->lock);
> +
> + __intel_breadcrumbs_disable_irq(b);
> + if (intel_engine_has_waiter(engine)) {
> + b->timeout = wait_timeout();
> + __intel_breadcrumbs_enable_irq(b);
> + }
> +
> + spin_unlock(&b->lock);
> +}
> +
>  void intel_engine_fini_breadcrumbs(struct intel_engine_cs *engine)
>  {
>   struct intel_breadcrumbs *b = &engine->breadcrumbs;
> diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c 
> b/drivers/gpu/drm/i915/intel_engine_cs.c
> index c8ac72ba4000..755f1a8b76d8 100644
> --- a/drivers/gpu/drm/i915/intel_engine_cs.c
> +++ b/drivers/gpu/drm/i915/intel_engine_cs.c
> @@ -210,9 +210,6 @@ void intel_engine_init_global_seqno(struct 
> intel_engine_cs *engine, u32 seqno)
>  void intel_engine_init_hangcheck(struct intel_engine_cs *engine)
>  {
>   memset(&engine->hangcheck, 0, sizeof(engine->hangcheck));
> - clear_bit(engine->id, &engine->i915->gpu_error.missed_irq_rings);
> - if (intel_engine_has_waiter(engine))
> - i915_queue_hangcheck(engine->i915);
>  }
>  
>  static void intel_engine_init_timeline(struct intel_engine_cs *engine)
> @@ -308,18 +305,6 @@ int intel_engine_init_common(struct intel_engine_cs 
> *engine)
>   return 0;
>  }
>  
> -void intel_engine_reset_irq(struct intel_engine_cs *engine)
> -{
> - struct drm_i915_private *dev_priv = engine->i915;
> -
> - spin_lock_irq(&dev_priv->irq_lock);
> - if (intel_engine_has_waiter(engine))
> - engine->irq_enable(engine);
> - else
> - engine->irq_disable(engine);
> - spin_unlock_irq(&dev_priv->irq_lock);
> -}
> -
>  /**
>   * intel_engines_cleanup_common - cleans up the engine state created by
>   *the common initiailizers.
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c 
> b/drivers/gpu/drm/i915/intel_lrc.c
> index bf22c94c3d53..eb162553cff2 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -1199,7 +1199,7 @@ static int gen8_init_common_ring(struct intel_engine_cs 
> *engine)
>  
>   lrc_init_hws(engine);
>  
> - intel_engine_reset_irq(engine);
> + intel_engine_reset_breadcrumbs(engine);
>  
>   I915_WRITE(RING_HWSTAM(engine->mmio_base), 0x);
>  
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c 
> b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 4bc47af68454..3abfbe3cfed9 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -548,7 +548,7 @@ static int init_ring_common(struct intel_engine_cs 
> *engine)
>   else
>   intel_ring_setup_status_page(engine);
>  
> - intel_engine_reset_irq(engine);
> + intel_engine_reset_breadcrumbs(engine);
>  
>   /* Enforce ordering by reading HEAD register back */
>   I915_READ_HEAD(engine);
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h 
> b/drivers/gpu/drm/i915/intel_ringbuffer.h
> index 29d37b7c6021..a888f68d63d9 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
> @@ -482,7 +482,6 @@ int __intel_ring_space(int head, int tail, int size);
>  void intel_ring_update_space(struct intel_ring *ring);
>  
>  void intel_engine_init_global_seqno(struct intel_engine_cs *engine, u32 
> seqno);
> -void intel_engine_reset_irq(struct intel_engine_cs *engine);
>  
>  void 

[Intel-gfx] [PATCH] drm/i915: Reset the breadcrumbs IRQ more carefully

2016-10-06 Thread Chris Wilson
Along with the interrupt, we want to restore the fake-irq and
wait-timeout detection. If we use the breadcrumbs interface to setup the
interrupt as it wants, the auxiliary timers will also be restored.

Fixes: 821ed7df6e2a ("drm/i915: Update reset path to fix incomplete requests")
Signed-off-by: Chris Wilson 
Cc: Mika Kuoppala 
---
 drivers/gpu/drm/i915/intel_breadcrumbs.c | 17 +
 drivers/gpu/drm/i915/intel_engine_cs.c   | 15 ---
 drivers/gpu/drm/i915/intel_lrc.c |  2 +-
 drivers/gpu/drm/i915/intel_ringbuffer.c  |  2 +-
 drivers/gpu/drm/i915/intel_ringbuffer.h  |  2 +-
 5 files changed, 20 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_breadcrumbs.c 
b/drivers/gpu/drm/i915/intel_breadcrumbs.c
index 9dba4971fb1e..d27da6d69735 100644
--- a/drivers/gpu/drm/i915/intel_breadcrumbs.c
+++ b/drivers/gpu/drm/i915/intel_breadcrumbs.c
@@ -584,6 +584,23 @@ int intel_engine_init_breadcrumbs(struct intel_engine_cs 
*engine)
return 0;
 }
 
+void intel_engine_reset_breadcrumbs(struct intel_engine_cs *engine)
+{
+   struct intel_breadcrumbs *b = &engine->breadcrumbs;
+
+   clear_bit(engine->id, &engine->i915->gpu_error.missed_irq_rings);
+
+   spin_lock(&b->lock);
+
+   __intel_breadcrumbs_disable_irq(b);
+   if (intel_engine_has_waiter(engine)) {
+   b->timeout = wait_timeout();
+   __intel_breadcrumbs_enable_irq(b);
+   }
+
+   spin_unlock(&b->lock);
+}
+
 void intel_engine_fini_breadcrumbs(struct intel_engine_cs *engine)
 {
struct intel_breadcrumbs *b = &engine->breadcrumbs;
diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c 
b/drivers/gpu/drm/i915/intel_engine_cs.c
index c8ac72ba4000..755f1a8b76d8 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -210,9 +210,6 @@ void intel_engine_init_global_seqno(struct intel_engine_cs 
*engine, u32 seqno)
 void intel_engine_init_hangcheck(struct intel_engine_cs *engine)
 {
memset(&engine->hangcheck, 0, sizeof(engine->hangcheck));
-   clear_bit(engine->id, &engine->i915->gpu_error.missed_irq_rings);
-   if (intel_engine_has_waiter(engine))
-   i915_queue_hangcheck(engine->i915);
 }
 
 static void intel_engine_init_timeline(struct intel_engine_cs *engine)
@@ -308,18 +305,6 @@ int intel_engine_init_common(struct intel_engine_cs 
*engine)
return 0;
 }
 
-void intel_engine_reset_irq(struct intel_engine_cs *engine)
-{
-   struct drm_i915_private *dev_priv = engine->i915;
-
-   spin_lock_irq(&dev_priv->irq_lock);
-   if (intel_engine_has_waiter(engine))
-   engine->irq_enable(engine);
-   else
-   engine->irq_disable(engine);
-   spin_unlock_irq(&dev_priv->irq_lock);
-}
-
 /**
  * intel_engines_cleanup_common - cleans up the engine state created by
  *the common initiailizers.
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index bf22c94c3d53..eb162553cff2 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1199,7 +1199,7 @@ static int gen8_init_common_ring(struct intel_engine_cs 
*engine)
 
lrc_init_hws(engine);
 
-   intel_engine_reset_irq(engine);
+   intel_engine_reset_breadcrumbs(engine);
 
I915_WRITE(RING_HWSTAM(engine->mmio_base), 0x);
 
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c 
b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 4bc47af68454..3abfbe3cfed9 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -548,7 +548,7 @@ static int init_ring_common(struct intel_engine_cs *engine)
else
intel_ring_setup_status_page(engine);
 
-   intel_engine_reset_irq(engine);
+   intel_engine_reset_breadcrumbs(engine);
 
/* Enforce ordering by reading HEAD register back */
I915_READ_HEAD(engine);
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h 
b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 29d37b7c6021..a888f68d63d9 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -482,7 +482,6 @@ int __intel_ring_space(int head, int tail, int size);
 void intel_ring_update_space(struct intel_ring *ring);
 
 void intel_engine_init_global_seqno(struct intel_engine_cs *engine, u32 seqno);
-void intel_engine_reset_irq(struct intel_engine_cs *engine);
 
 void intel_engine_setup_common(struct intel_engine_cs *engine);
 int intel_engine_init_common(struct intel_engine_cs *engine);
@@ -568,6 +567,7 @@ static inline bool intel_engine_wakeup(const struct 
intel_engine_cs *engine)
return wakeup;
 }
 
+void intel_engine_reset_breadcrumbs(struct intel_engine_cs *engine);
 void intel_engine_fini_breadcrumbs(struct intel_engine_cs *engine);
 unsigned int intel_kick_waiters(struct drm_i915_private *i915);
 unsigned int intel_kick_signalers(struct drm_i915_private *i915);