Re: [Intel-gfx] [PATCH] drm/i915: Reset the breadcrumbs IRQ more carefully
On Thu, Oct 06, 2016 at 05:02:44PM +0300, Mika Kuoppala wrote:
> Chris Wilson writes:
>
> > On Thu, Oct 06, 2016 at 04:32:37PM +0300, Mika Kuoppala wrote:
> >> Chris Wilson writes:
> >>
> >> > Along with the interrupt, we want to restore the fake-irq and
> >> > wait-timeout detection. If we use the breadcrumbs interface to setup the
> >> > interrupt as it wants, the auxiliary timers will also be restored.
> >> >
> >> > Fixes: 821ed7df6e2a ("drm/i915: Update reset path to fix incomplete
> >> > requests")
> >> > Signed-off-by: Chris Wilson
> >> > Cc: Mika Kuoppala
> >> > ---
> >> >  drivers/gpu/drm/i915/intel_breadcrumbs.c | 17 +
> >> >  drivers/gpu/drm/i915/intel_engine_cs.c   | 15 ---
> >> >  drivers/gpu/drm/i915/intel_lrc.c         |  2 +-
> >> >  drivers/gpu/drm/i915/intel_ringbuffer.c  |  2 +-
> >> >  drivers/gpu/drm/i915/intel_ringbuffer.h  |  2 +-
> >> >  5 files changed, 20 insertions(+), 18 deletions(-)
> >> >
> >> > diff --git a/drivers/gpu/drm/i915/intel_breadcrumbs.c
> >> > b/drivers/gpu/drm/i915/intel_breadcrumbs.c
> >> > index 9dba4971fb1e..d27da6d69735 100644
> >> > --- a/drivers/gpu/drm/i915/intel_breadcrumbs.c
> >> > +++ b/drivers/gpu/drm/i915/intel_breadcrumbs.c
> >> > @@ -584,6 +584,23 @@ int intel_engine_init_breadcrumbs(struct
> >> > intel_engine_cs *engine)
> >> >  	return 0;
> >> >  }
> >> >
> >> > +void intel_engine_reset_breadcrumbs(struct intel_engine_cs *engine)
> >> > +{
> >> > +	struct intel_breadcrumbs *b = &engine->breadcrumbs;
> >> > +
> >>
> >> Should we kill the timer before proceeding in here?
> >
> > Which timer? In breadcrumbs.c, we are concerned with the fake_irq and
> > the wait-timeout. The wait-timeout is reset below, we should add the
> > code to cancel the fake_irq along with clearing the bit.
>
> I was considering that irqs are enabled and we have an
> active breadcrumbs timer, triggering at the same time as
> reset happens. So we would enable the fake irq as a post reset
> race between reset/breadcrumbs hangcheck.
>
> As in why not cancel and postpone the timer and only after
> clear the missed_irq?

So just picking up that we don't cancel the fake irq along with the
clear_bit() (currently just waiting for the wait to complete before
cancelling).

> >> Not relevant to this patch but I also noticed that the period
> >> is identical to hangcheck period. Multiple of hangcheck period
> >> would be better, as our kicking might help and we don't
> >> want to fallback to fake irqs just so easily.
> >
> > ?
> >
> > The main GPU hangcheck is kicked off by the wait timeout. Keeping the
> > two pieces independent (fake-irq, hangcheck) is quite nice, and the
> > jiffie wake up serves as a backup, and either it is required or it will
> > be disabled by the reset.
>
> But we queue hangcheck also from retire work. So it could be that
> we fallback to fake irqs, even if next hangcheck might have
> managed to kick the wait and make forward progress?

Below the level of care. The limited kicking that hangcheck does is
immaterial to deciding whether or not we might need fake user
interrupts.
-Chris

--
Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
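
The ordering question raised in this message (cancel the timers first, and only then clear missed_irq) can be illustrated with a minimal userspace sketch. This is not i915 code: the struct, its fields and every model_* function are hypothetical names invented for the illustration. It also assumes, as the discussion implies, that an expired wait timer marks the interrupt as missed and arms the fake irq.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; not the real struct intel_breadcrumbs. */
struct model_breadcrumbs {
	bool fake_irq_armed;   /* fake-irq timer is pending */
	bool wait_timer_armed; /* wait-timeout timer is pending */
	bool missed_irq;       /* this engine's bit in missed_irq_rings */
};

/*
 * Assumed behaviour of the wait-timeout timer when it expires:
 * record a missed interrupt and fall back to the fake irq.
 */
static void model_wait_timer_fire(struct model_breadcrumbs *b)
{
	assert(b != NULL);
	if (!b->wait_timer_armed)
		return; /* a cancelled timer never runs */
	b->wait_timer_armed = false;
	b->missed_irq = true;
	b->fake_irq_armed = true;
}

/* Reset that only clears the bit and leaves a pending timer alone. */
static void model_reset_clear_only(struct model_breadcrumbs *b)
{
	assert(b != NULL);
	b->missed_irq = false;
}

/* Reset that cancels both timers before clearing the bit. */
static void model_reset_cancel_first(struct model_breadcrumbs *b,
				     bool has_waiter)
{
	assert(b != NULL);
	b->wait_timer_armed = false;
	b->fake_irq_armed = false;
	b->missed_irq = false;
	/* Re-arm the wait timeout only if somebody is still waiting. */
	if (has_waiter)
		b->wait_timer_armed = true;
}
```

In this model, a timer that was pending before model_reset_clear_only() can still fire afterwards and set missed_irq again, whereas after model_reset_cancel_first() a stale expiry is a no-op. A single-threaded model cannot reproduce the real race; it only shows why the order of the two steps matters.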
Re: [Intel-gfx] [PATCH] drm/i915: Reset the breadcrumbs IRQ more carefully
Chris Wilson writes:

> On Thu, Oct 06, 2016 at 04:32:37PM +0300, Mika Kuoppala wrote:
>> Chris Wilson writes:
>>
>> > Along with the interrupt, we want to restore the fake-irq and
>> > wait-timeout detection. If we use the breadcrumbs interface to setup the
>> > interrupt as it wants, the auxiliary timers will also be restored.
>> >
>> > Fixes: 821ed7df6e2a ("drm/i915: Update reset path to fix incomplete
>> > requests")
>> > Signed-off-by: Chris Wilson
>> > Cc: Mika Kuoppala
>> > ---
>> >  drivers/gpu/drm/i915/intel_breadcrumbs.c | 17 +
>> >  drivers/gpu/drm/i915/intel_engine_cs.c   | 15 ---
>> >  drivers/gpu/drm/i915/intel_lrc.c         |  2 +-
>> >  drivers/gpu/drm/i915/intel_ringbuffer.c  |  2 +-
>> >  drivers/gpu/drm/i915/intel_ringbuffer.h  |  2 +-
>> >  5 files changed, 20 insertions(+), 18 deletions(-)
>> >
>> > diff --git a/drivers/gpu/drm/i915/intel_breadcrumbs.c
>> > b/drivers/gpu/drm/i915/intel_breadcrumbs.c
>> > index 9dba4971fb1e..d27da6d69735 100644
>> > --- a/drivers/gpu/drm/i915/intel_breadcrumbs.c
>> > +++ b/drivers/gpu/drm/i915/intel_breadcrumbs.c
>> > @@ -584,6 +584,23 @@ int intel_engine_init_breadcrumbs(struct
>> > intel_engine_cs *engine)
>> >  	return 0;
>> >  }
>> >
>> > +void intel_engine_reset_breadcrumbs(struct intel_engine_cs *engine)
>> > +{
>> > +	struct intel_breadcrumbs *b = &engine->breadcrumbs;
>> > +
>>
>> Should we kill the timer before proceeding in here?
>
> Which timer? In breadcrumbs.c, we are concerned with the fake_irq and
> the wait-timeout. The wait-timeout is reset below, we should add the
> code to cancel the fake_irq along with clearing the bit.

I was considering that irqs are enabled and we have an
active breadcrumbs timer, triggering at the same time as
reset happens. So we would enable the fake irq as a post reset
race between reset/breadcrumbs hangcheck.

As in why not cancel and postpone the timer and only after
clear the missed_irq?

>
>> Not relevant to this patch but I also noticed that the period
>> is identical to hangcheck period. Multiple of hangcheck period
>> would be better, as our kicking might help and we don't
>> want to fallback to fake irqs just so easily.
>
> ?
>
> The main GPU hangcheck is kicked off by the wait timeout. Keeping the
> two pieces independent (fake-irq, hangcheck) is quite nice, and the
> jiffie wake up serves as a backup, and either it is required or it will
> be disabled by the reset.

But we queue hangcheck also from retire work. So it could be that
we fallback to fake irqs, even if next hangcheck might have
managed to kick the wait and make forward progress?

And perhaps we should rename the breadcrumb hangcheck as wait_watchdog
to avoid confusion between different independent 'hangchecks'

-Mika

> -Chris
>
> --
> Chris Wilson, Intel Open Source Technology Centre
Re: [Intel-gfx] [PATCH] drm/i915: Reset the breadcrumbs IRQ more carefully
On Thu, Oct 06, 2016 at 04:32:37PM +0300, Mika Kuoppala wrote:
> Chris Wilson writes:
>
> > Along with the interrupt, we want to restore the fake-irq and
> > wait-timeout detection. If we use the breadcrumbs interface to setup the
> > interrupt as it wants, the auxiliary timers will also be restored.
> >
> > Fixes: 821ed7df6e2a ("drm/i915: Update reset path to fix incomplete
> > requests")
> > Signed-off-by: Chris Wilson
> > Cc: Mika Kuoppala
> > ---
> >  drivers/gpu/drm/i915/intel_breadcrumbs.c | 17 +
> >  drivers/gpu/drm/i915/intel_engine_cs.c   | 15 ---
> >  drivers/gpu/drm/i915/intel_lrc.c         |  2 +-
> >  drivers/gpu/drm/i915/intel_ringbuffer.c  |  2 +-
> >  drivers/gpu/drm/i915/intel_ringbuffer.h  |  2 +-
> >  5 files changed, 20 insertions(+), 18 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/intel_breadcrumbs.c
> > b/drivers/gpu/drm/i915/intel_breadcrumbs.c
> > index 9dba4971fb1e..d27da6d69735 100644
> > --- a/drivers/gpu/drm/i915/intel_breadcrumbs.c
> > +++ b/drivers/gpu/drm/i915/intel_breadcrumbs.c
> > @@ -584,6 +584,23 @@ int intel_engine_init_breadcrumbs(struct
> > intel_engine_cs *engine)
> >  	return 0;
> >  }
> >
> > +void intel_engine_reset_breadcrumbs(struct intel_engine_cs *engine)
> > +{
> > +	struct intel_breadcrumbs *b = &engine->breadcrumbs;
> > +
>
> Should we kill the timer before proceeding in here?

Which timer? In breadcrumbs.c, we are concerned with the fake_irq and
the wait-timeout. The wait-timeout is reset below, we should add the
code to cancel the fake_irq along with clearing the bit.

> Not relevant to this patch but I also noticed that the period
> is identical to hangcheck period. Multiple of hangcheck period
> would be better, as our kicking might help and we don't
> want to fallback to fake irqs just so easily.

?

The main GPU hangcheck is kicked off by the wait timeout. Keeping the
two pieces independent (fake-irq, hangcheck) is quite nice, and the
jiffie wake up serves as a backup, and either it is required or it will
be disabled by the reset.
-Chris

--
Chris Wilson, Intel Open Source Technology Centre
Re: [Intel-gfx] [PATCH] drm/i915: Reset the breadcrumbs IRQ more carefully
Chris Wilson writes:

> Along with the interrupt, we want to restore the fake-irq and
> wait-timeout detection. If we use the breadcrumbs interface to setup the
> interrupt as it wants, the auxiliary timers will also be restored.
>
> Fixes: 821ed7df6e2a ("drm/i915: Update reset path to fix incomplete requests")
> Signed-off-by: Chris Wilson
> Cc: Mika Kuoppala
> ---
>  drivers/gpu/drm/i915/intel_breadcrumbs.c | 17 +
>  drivers/gpu/drm/i915/intel_engine_cs.c   | 15 ---
>  drivers/gpu/drm/i915/intel_lrc.c         |  2 +-
>  drivers/gpu/drm/i915/intel_ringbuffer.c  |  2 +-
>  drivers/gpu/drm/i915/intel_ringbuffer.h  |  2 +-
>  5 files changed, 20 insertions(+), 18 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_breadcrumbs.c
> b/drivers/gpu/drm/i915/intel_breadcrumbs.c
> index 9dba4971fb1e..d27da6d69735 100644
> --- a/drivers/gpu/drm/i915/intel_breadcrumbs.c
> +++ b/drivers/gpu/drm/i915/intel_breadcrumbs.c
> @@ -584,6 +584,23 @@ int intel_engine_init_breadcrumbs(struct intel_engine_cs
> *engine)
>  	return 0;
>  }
>
> +void intel_engine_reset_breadcrumbs(struct intel_engine_cs *engine)
> +{
> +	struct intel_breadcrumbs *b = &engine->breadcrumbs;
> +

Should we kill the timer before proceeding in here?

Not relevant to this patch but I also noticed that the period
is identical to hangcheck period. Multiple of hangcheck period
would be better, as our kicking might help and we don't
want to fallback to fake irqs just so easily.

-Mika

> +	clear_bit(engine->id, &engine->i915->gpu_error.missed_irq_rings);
> +
> +	spin_lock(&b->lock);
> +
> +	__intel_breadcrumbs_disable_irq(b);
> +	if (intel_engine_has_waiter(engine)) {
> +		b->timeout = wait_timeout();
> +		__intel_breadcrumbs_enable_irq(b);
> +	}
> +
> +	spin_unlock(&b->lock);
> +}
> +
>  void intel_engine_fini_breadcrumbs(struct intel_engine_cs *engine)
>  {
>  	struct intel_breadcrumbs *b = &engine->breadcrumbs;
> diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c
> b/drivers/gpu/drm/i915/intel_engine_cs.c
> index c8ac72ba4000..755f1a8b76d8 100644
> --- a/drivers/gpu/drm/i915/intel_engine_cs.c
> +++ b/drivers/gpu/drm/i915/intel_engine_cs.c
> @@ -210,9 +210,6 @@ void intel_engine_init_global_seqno(struct
> intel_engine_cs *engine, u32 seqno)
>  void intel_engine_init_hangcheck(struct intel_engine_cs *engine)
>  {
>  	memset(&engine->hangcheck, 0, sizeof(engine->hangcheck));
> -	clear_bit(engine->id, &engine->i915->gpu_error.missed_irq_rings);
> -	if (intel_engine_has_waiter(engine))
> -		i915_queue_hangcheck(engine->i915);
>  }
>
>  static void intel_engine_init_timeline(struct intel_engine_cs *engine)
> @@ -308,18 +305,6 @@ int intel_engine_init_common(struct intel_engine_cs
> *engine)
>  	return 0;
>  }
>
> -void intel_engine_reset_irq(struct intel_engine_cs *engine)
> -{
> -	struct drm_i915_private *dev_priv = engine->i915;
> -
> -	spin_lock_irq(&dev_priv->irq_lock);
> -	if (intel_engine_has_waiter(engine))
> -		engine->irq_enable(engine);
> -	else
> -		engine->irq_disable(engine);
> -	spin_unlock_irq(&dev_priv->irq_lock);
> -}
> -
>  /**
>   * intel_engines_cleanup_common - cleans up the engine state created by
>   *                                the common initiailizers.
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c
> b/drivers/gpu/drm/i915/intel_lrc.c
> index bf22c94c3d53..eb162553cff2 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -1199,7 +1199,7 @@ static int gen8_init_common_ring(struct intel_engine_cs
> *engine)
>
>  	lrc_init_hws(engine);
>
> -	intel_engine_reset_irq(engine);
> +	intel_engine_reset_breadcrumbs(engine);
>
>  	I915_WRITE(RING_HWSTAM(engine->mmio_base), 0x);
>
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c
> b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 4bc47af68454..3abfbe3cfed9 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -548,7 +548,7 @@ static int init_ring_common(struct intel_engine_cs
> *engine)
>  	else
>  		intel_ring_setup_status_page(engine);
>
> -	intel_engine_reset_irq(engine);
> +	intel_engine_reset_breadcrumbs(engine);
>
>  	/* Enforce ordering by reading HEAD register back */
>  	I915_READ_HEAD(engine);
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h
> b/drivers/gpu/drm/i915/intel_ringbuffer.h
> index 29d37b7c6021..a888f68d63d9 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
> @@ -482,7 +482,6 @@ int __intel_ring_space(int head, int tail, int size);
>  void intel_ring_update_space(struct intel_ring *ring);
>
>  void intel_engine_init_global_seqno(struct intel_engine_cs *engine, u32
> seqno);
> -void intel_engine_reset_irq(struct intel_engine_cs *engine);
>
> void
[Intel-gfx] [PATCH] drm/i915: Reset the breadcrumbs IRQ more carefully
Along with the interrupt, we want to restore the fake-irq and
wait-timeout detection. If we use the breadcrumbs interface to setup the
interrupt as it wants, the auxiliary timers will also be restored.

Fixes: 821ed7df6e2a ("drm/i915: Update reset path to fix incomplete requests")
Signed-off-by: Chris Wilson
Cc: Mika Kuoppala
---
 drivers/gpu/drm/i915/intel_breadcrumbs.c | 17 +
 drivers/gpu/drm/i915/intel_engine_cs.c   | 15 ---
 drivers/gpu/drm/i915/intel_lrc.c         |  2 +-
 drivers/gpu/drm/i915/intel_ringbuffer.c  |  2 +-
 drivers/gpu/drm/i915/intel_ringbuffer.h  |  2 +-
 5 files changed, 20 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_breadcrumbs.c b/drivers/gpu/drm/i915/intel_breadcrumbs.c
index 9dba4971fb1e..d27da6d69735 100644
--- a/drivers/gpu/drm/i915/intel_breadcrumbs.c
+++ b/drivers/gpu/drm/i915/intel_breadcrumbs.c
@@ -584,6 +584,23 @@ int intel_engine_init_breadcrumbs(struct intel_engine_cs *engine)
 	return 0;
 }
 
+void intel_engine_reset_breadcrumbs(struct intel_engine_cs *engine)
+{
+	struct intel_breadcrumbs *b = &engine->breadcrumbs;
+
+	clear_bit(engine->id, &engine->i915->gpu_error.missed_irq_rings);
+
+	spin_lock(&b->lock);
+
+	__intel_breadcrumbs_disable_irq(b);
+	if (intel_engine_has_waiter(engine)) {
+		b->timeout = wait_timeout();
+		__intel_breadcrumbs_enable_irq(b);
+	}
+
+	spin_unlock(&b->lock);
+}
+
 void intel_engine_fini_breadcrumbs(struct intel_engine_cs *engine)
 {
 	struct intel_breadcrumbs *b = &engine->breadcrumbs;
diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
index c8ac72ba4000..755f1a8b76d8 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -210,9 +210,6 @@ void intel_engine_init_global_seqno(struct intel_engine_cs *engine, u32 seqno)
 void intel_engine_init_hangcheck(struct intel_engine_cs *engine)
 {
 	memset(&engine->hangcheck, 0, sizeof(engine->hangcheck));
-	clear_bit(engine->id, &engine->i915->gpu_error.missed_irq_rings);
-	if (intel_engine_has_waiter(engine))
-		i915_queue_hangcheck(engine->i915);
 }
 
 static void intel_engine_init_timeline(struct intel_engine_cs *engine)
@@ -308,18 +305,6 @@ int intel_engine_init_common(struct intel_engine_cs *engine)
 	return 0;
 }
 
-void intel_engine_reset_irq(struct intel_engine_cs *engine)
-{
-	struct drm_i915_private *dev_priv = engine->i915;
-
-	spin_lock_irq(&dev_priv->irq_lock);
-	if (intel_engine_has_waiter(engine))
-		engine->irq_enable(engine);
-	else
-		engine->irq_disable(engine);
-	spin_unlock_irq(&dev_priv->irq_lock);
-}
-
 /**
  * intel_engines_cleanup_common - cleans up the engine state created by
  *                                the common initiailizers.
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index bf22c94c3d53..eb162553cff2 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1199,7 +1199,7 @@ static int gen8_init_common_ring(struct intel_engine_cs *engine)
 
 	lrc_init_hws(engine);
 
-	intel_engine_reset_irq(engine);
+	intel_engine_reset_breadcrumbs(engine);
 
 	I915_WRITE(RING_HWSTAM(engine->mmio_base), 0x);
 
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 4bc47af68454..3abfbe3cfed9 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -548,7 +548,7 @@ static int init_ring_common(struct intel_engine_cs *engine)
 	else
 		intel_ring_setup_status_page(engine);
 
-	intel_engine_reset_irq(engine);
+	intel_engine_reset_breadcrumbs(engine);
 
 	/* Enforce ordering by reading HEAD register back */
 	I915_READ_HEAD(engine);
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 29d37b7c6021..a888f68d63d9 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -482,7 +482,6 @@ int __intel_ring_space(int head, int tail, int size);
 void intel_ring_update_space(struct intel_ring *ring);
 
 void intel_engine_init_global_seqno(struct intel_engine_cs *engine, u32 seqno);
-void intel_engine_reset_irq(struct intel_engine_cs *engine);
 
 void intel_engine_setup_common(struct intel_engine_cs *engine);
 int intel_engine_init_common(struct intel_engine_cs *engine);
@@ -568,6 +567,7 @@ static inline bool intel_engine_wakeup(const struct intel_engine_cs *engine)
 	return wakeup;
 }
 
+void intel_engine_reset_breadcrumbs(struct intel_engine_cs *engine);
 void intel_engine_fini_breadcrumbs(struct intel_engine_cs *engine);
 unsigned int intel_kick_waiters(struct drm_i915_private *i915);
 unsigned int intel_kick_signalers(struct drm_i915_private *i915);
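
For readers without the driver source to hand, the control flow of the new intel_engine_reset_breadcrumbs() can be mirrored in a small userspace sketch. Everything here is a stand-in: struct model_engine, model_wait_timeout() and model_reset_breadcrumbs() are hypothetical names, the locking is omitted, and the 1500-tick delay is an arbitrary placeholder rather than the driver's value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only; not the real struct intel_engine_cs. */
struct model_engine {
	unsigned int id;       /* engine index, used as a bit position */
	bool has_waiter;       /* stands in for intel_engine_has_waiter() */
	bool irq_enabled;      /* user interrupt currently enabled */
	unsigned long timeout; /* deadline for the wait-timeout check */
};

/* Hypothetical stand-in for wait_timeout(): "now" plus a fixed delay. */
static unsigned long model_wait_timeout(unsigned long now)
{
	return now + 1500; /* arbitrary delay chosen for the sketch */
}

static void model_reset_breadcrumbs(struct model_engine *e,
				    unsigned long *missed_irq_rings,
				    unsigned long now)
{
	assert(e != NULL && missed_irq_rings != NULL);

	/* Forget any missed interrupt recorded before the reset. */
	*missed_irq_rings &= ~(1UL << e->id);

	/* Start from a known state: interrupt disabled. */
	e->irq_enabled = false;

	/* Re-arm the interrupt and its timeout only if somebody waits. */
	if (e->has_waiter) {
		e->timeout = model_wait_timeout(now);
		e->irq_enabled = true;
	}
}
```

Compared with the removed intel_engine_reset_irq(), which only toggled the interrupt, this path also refreshes the timeout whenever a waiter exists, which is the restoration of the auxiliary timers that the commit message describes.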