Re: [Intel-gfx] [PATCH v2 4/6] drm/i915: Disable PSR when a PSR aux error happen

2018-10-24 Thread Souza, Jose
On Wed, 2018-10-24 at 16:22 -0700, Dhinakaran Pandiyan wrote:
> On Wed, 2018-10-24 at 15:08 -0700, Dhinakaran Pandiyan wrote:
> > On Sat, 2018-10-20 at 00:12 +, Souza, Jose wrote:
> > > On Fri, 2018-10-19 at 16:14 -0700, Dhinakaran Pandiyan wrote:
> > > > On Wed, 2018-10-10 at 17:41 -0700, José Roberto de Souza wrote:
> > > > > While PSR is active hardware will do aux transactions by it
> > > > > self
> > > > > to
> > > > > wakeup sink to receive a new frame when necessary. If that
> > > > > transaction is not acked by sink, hardware will trigger this
> > > > > interruption.
> > > > > 
> > > > > So let's disable PSR as it is a hint that there is problem
> > > > > with
> > > > > this
> > > > > sink.
> > > > > 
> > > > > The removed FIXME was asking to manually train the link but
> > > > > we
> > > > > don't
> > > > > need to do that as by spec sink should do a short pulse when
> > > > > it
> > > > > is
> > > > > out of sync with source, we just need to make sure it is
> > > > > awaken
> > > > > and
> > > > > the SDP header with PSR disable will trigger this condition.
> > > > > 
> > > > > Cc: Dhinakaran Pandiyan 
> > > > > Signed-off-by: José Roberto de Souza 
> > > > > ---
> > > > >  drivers/gpu/drm/i915/i915_drv.h  |  1 +
> > > > >  drivers/gpu/drm/i915/intel_psr.c | 39
> > > > > 
> > > > > 
> > > > >  2 files changed, 36 insertions(+), 4 deletions(-)
> > > > > 
> > > > > diff --git a/drivers/gpu/drm/i915/i915_drv.h
> > > > > b/drivers/gpu/drm/i915/i915_drv.h
> > > > > index 3017ef037fed..e8ba00dd2c51 100644
> > > > > --- a/drivers/gpu/drm/i915/i915_drv.h
> > > > > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > > > > @@ -638,6 +638,7 @@ struct i915_psr {
> > > > >   u8 sink_sync_latency;
> > > > >   ktime_t last_entry_attempt;
> > > > >   ktime_t last_exit;
> > > > > + u32 irq_aux_error;
> > > > >  };
> > > > >  
> > > > >  enum intel_pch {
> > > > > diff --git a/drivers/gpu/drm/i915/intel_psr.c
> > > > > b/drivers/gpu/drm/i915/intel_psr.c
> > > > > index 70d4e26e17b5..ad09130cb4ad 100644
> > > > > --- a/drivers/gpu/drm/i915/intel_psr.c
> > > > > +++ b/drivers/gpu/drm/i915/intel_psr.c
> > > > > @@ -159,10 +159,16 @@ void intel_psr_irq_handler(struct
> > > > > drm_i915_private *dev_priv, u32 psr_iir)
> > > > >  BIT(TRANSCODER_C);
> > > > >  
> > > > >   for_each_cpu_transcoder_masked(dev_priv,
> > > > > cpu_transcoder,
> > > > > transcoders) {
> > > > > - /* FIXME: Exit PSR and link train manually when
> > > > > this
> > > > > happens. */
> > > > > - if (psr_iir & EDP_PSR_ERROR(cpu_transcoder))
> > > > > - DRM_DEBUG_KMS("[transcoder %s] PSR aux
> > > > > error\n",
> > > > > -   transcoder_name(cpu_trans
> > > > > coder));
> > > > > + if (psr_iir & EDP_PSR_ERROR(cpu_transcoder)) {
> > > > > + DRM_WARN("[transcoder %s] PSR aux
> > > > > error\n",
> > > > > +  transcoder_name(cpu_transcoder
> > > > > ));
> > > > 
> > > > Downgrade this to debug since the error is handled in the
> > > > driver? 
> > > 
> > > I think is better keep as DRM_WARN so it is shown in regular
> > > kernel
> > > logs this way if a user opens a bug complaning why PSR is
> > > disabled
> > > we
> > > can check that is because of PSR aux error.
> > > 
> > > > > +
> > > > > + spin_lock(_priv->irq_lock);
> > 
> > This lock isn't needed either. How about setting a bool only if the
> > transcoder is eDP and then scheduling a disable.
> > 
> > > > > + dev_priv->psr.irq_aux_error |=
> > > > > BIT(cpu_transcoder);
> > > > 
> > > > Just ignore the non eDP bits, I don't think we want to do
> > > > anything
> > > > with
> > > > the information that some other bit was set.
> > > > 
> > > > > + spin_unlock(_priv->irq_lock);
> > > > > +
> > > > > + schedule_work(_priv->psr.work);
> > > > > + }
> > > > >  
> > > > >   if (psr_iir &
> > > > > EDP_PSR_PRE_ENTRY(cpu_transcoder)) {
> > > > >   dev_priv->psr.last_entry_attempt =
> > > > > time_ns;
> > > > > @@ -893,11 +899,36 @@ int intel_psr_set_debugfs_mode(struct
> > > > > drm_i915_private *dev_priv,
> > > > >   return ret;
> > > > >  }
> > > > >  
> > > > > +static void intel_psr_handle_irq(struct drm_i915_private
> > > > > *dev_priv)
> > > > > +{
> > > > > + struct i915_psr *psr = _priv->psr;
> > > > > + u32 irq_aux_error;
> > > > > +
> > > > > + spin_lock_irq(_priv->irq_lock);
> > > > > + irq_aux_error = psr->irq_aux_error;
> > > > > + psr->irq_aux_error = 0;
> > > > 
> > > > A subsequent modeset will enable PSR again. I don't expect a
> > > > modeset
> > > > to
> > > > to be able to fix an AUX wake up failure, so might as well
> > > > disable
> > > > it
> > > > for good.
> > > 
> > > Add another field to do that or set sink_support=false? I 

Re: [Intel-gfx] [PATCH v2 4/6] drm/i915: Disable PSR when a PSR aux error happen

2018-10-24 Thread Dhinakaran Pandiyan
On Wed, 2018-10-24 at 15:08 -0700, Dhinakaran Pandiyan wrote:
> On Sat, 2018-10-20 at 00:12 +, Souza, Jose wrote:
> > On Fri, 2018-10-19 at 16:14 -0700, Dhinakaran Pandiyan wrote:
> > > On Wed, 2018-10-10 at 17:41 -0700, José Roberto de Souza wrote:
> > > > While PSR is active hardware will do aux transactions by it
> > > > self
> > > > to
> > > > wakeup sink to receive a new frame when necessary. If that
> > > > transaction is not acked by sink, hardware will trigger this
> > > > interruption.
> > > > 
> > > > So let's disable PSR as it is a hint that there is problem with
> > > > this
> > > > sink.
> > > > 
> > > > The removed FIXME was asking to manually train the link but we
> > > > don't
> > > > need to do that as by spec sink should do a short pulse when it
> > > > is
> > > > out of sync with source, we just need to make sure it is awaken
> > > > and
> > > > the SDP header with PSR disable will trigger this condition.
> > > > 
> > > > Cc: Dhinakaran Pandiyan 
> > > > Signed-off-by: José Roberto de Souza 
> > > > ---
> > > >  drivers/gpu/drm/i915/i915_drv.h  |  1 +
> > > >  drivers/gpu/drm/i915/intel_psr.c | 39
> > > > 
> > > > 
> > > >  2 files changed, 36 insertions(+), 4 deletions(-)
> > > > 
> > > > diff --git a/drivers/gpu/drm/i915/i915_drv.h
> > > > b/drivers/gpu/drm/i915/i915_drv.h
> > > > index 3017ef037fed..e8ba00dd2c51 100644
> > > > --- a/drivers/gpu/drm/i915/i915_drv.h
> > > > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > > > @@ -638,6 +638,7 @@ struct i915_psr {
> > > > u8 sink_sync_latency;
> > > > ktime_t last_entry_attempt;
> > > > ktime_t last_exit;
> > > > +   u32 irq_aux_error;
> > > >  };
> > > >  
> > > >  enum intel_pch {
> > > > diff --git a/drivers/gpu/drm/i915/intel_psr.c
> > > > b/drivers/gpu/drm/i915/intel_psr.c
> > > > index 70d4e26e17b5..ad09130cb4ad 100644
> > > > --- a/drivers/gpu/drm/i915/intel_psr.c
> > > > +++ b/drivers/gpu/drm/i915/intel_psr.c
> > > > @@ -159,10 +159,16 @@ void intel_psr_irq_handler(struct
> > > > drm_i915_private *dev_priv, u32 psr_iir)
> > > >BIT(TRANSCODER_C);
> > > >  
> > > > for_each_cpu_transcoder_masked(dev_priv,
> > > > cpu_transcoder,
> > > > transcoders) {
> > > > -   /* FIXME: Exit PSR and link train manually when
> > > > this
> > > > happens. */
> > > > -   if (psr_iir & EDP_PSR_ERROR(cpu_transcoder))
> > > > -   DRM_DEBUG_KMS("[transcoder %s] PSR aux
> > > > error\n",
> > > > - transcoder_name(cpu_trans
> > > > coder));
> > > > +   if (psr_iir & EDP_PSR_ERROR(cpu_transcoder)) {
> > > > +   DRM_WARN("[transcoder %s] PSR aux
> > > > error\n",
> > > > +transcoder_name(cpu_transcoder
> > > > ));
> > > 
> > > Downgrade this to debug since the error is handled in the
> > > driver? 
> > 
> > I think is better keep as DRM_WARN so it is shown in regular kernel
> > logs this way if a user opens a bug complaning why PSR is disabled
> > we
> > can check that is because of PSR aux error.
> > 
> > > 
> > > > +
> > > > +   spin_lock(_priv->irq_lock);
> 
> This lock isn't needed either. How about setting a bool only if the
> transcoder is eDP and then scheduling a disable.
> 
> > > > +   dev_priv->psr.irq_aux_error |=
> > > > BIT(cpu_transcoder);
> > > 
> > > Just ignore the non eDP bits, I don't think we want to do
> > > anything
> > > with
> > > the information that some other bit was set.
> > > 
> > > > +   spin_unlock(_priv->irq_lock);
> > > > +
> > > > +   schedule_work(_priv->psr.work);
> > > > +   }
> > > >  
> > > > if (psr_iir &
> > > > EDP_PSR_PRE_ENTRY(cpu_transcoder)) {
> > > > dev_priv->psr.last_entry_attempt =
> > > > time_ns;
> > > > @@ -893,11 +899,36 @@ int intel_psr_set_debugfs_mode(struct
> > > > drm_i915_private *dev_priv,
> > > > return ret;
> > > >  }
> > > >  
> > > > +static void intel_psr_handle_irq(struct drm_i915_private
> > > > *dev_priv)
> > > > +{
> > > > +   struct i915_psr *psr = _priv->psr;
> > > > +   u32 irq_aux_error;
> > > > +
> > > > +   spin_lock_irq(_priv->irq_lock);
> > > > +   irq_aux_error = psr->irq_aux_error;
> > > > +   psr->irq_aux_error = 0;
> > > 
> > > A subsequent modeset will enable PSR again. I don't expect a
> > > modeset
> > > to
> > > to be able to fix an AUX wake up failure, so might as well
> > > disable
> > > it
> > > for good.
> > 
> > Add another field to do that or set sink_support=false? I guess PSR
> > short pulses errors should also disable it good too?
> 
> Reusing sink_support will get confusing, particularly because it is
> exposed in debugfs.
> 
> > 
> > > 
> > > > +   spin_unlock_irq(_priv->irq_lock);
> > > > +
> > > > +   /* right now PSR is only enabled in eDP */
> > > 
> > > 

Re: [Intel-gfx] [PATCH v2 4/6] drm/i915: Disable PSR when a PSR aux error happen

2018-10-24 Thread Dhinakaran Pandiyan
On Sat, 2018-10-20 at 00:12 +, Souza, Jose wrote:
> On Fri, 2018-10-19 at 16:14 -0700, Dhinakaran Pandiyan wrote:
> > On Wed, 2018-10-10 at 17:41 -0700, José Roberto de Souza wrote:
> > > While PSR is active hardware will do aux transactions by it self
> > > to
> > > wakeup sink to receive a new frame when necessary. If that
> > > transaction is not acked by sink, hardware will trigger this
> > > interruption.
> > > 
> > > So let's disable PSR as it is a hint that there is problem with
> > > this
> > > sink.
> > > 
> > > The removed FIXME was asking to manually train the link but we
> > > don't
> > > need to do that as by spec sink should do a short pulse when it
> > > is
> > > out of sync with source, we just need to make sure it is awaken
> > > and
> > > the SDP header with PSR disable will trigger this condition.
> > > 
> > > Cc: Dhinakaran Pandiyan 
> > > Signed-off-by: José Roberto de Souza 
> > > ---
> > >  drivers/gpu/drm/i915/i915_drv.h  |  1 +
> > >  drivers/gpu/drm/i915/intel_psr.c | 39
> > > 
> > > 
> > >  2 files changed, 36 insertions(+), 4 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/i915_drv.h
> > > b/drivers/gpu/drm/i915/i915_drv.h
> > > index 3017ef037fed..e8ba00dd2c51 100644
> > > --- a/drivers/gpu/drm/i915/i915_drv.h
> > > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > > @@ -638,6 +638,7 @@ struct i915_psr {
> > >   u8 sink_sync_latency;
> > >   ktime_t last_entry_attempt;
> > >   ktime_t last_exit;
> > > + u32 irq_aux_error;
> > >  };
> > >  
> > >  enum intel_pch {
> > > diff --git a/drivers/gpu/drm/i915/intel_psr.c
> > > b/drivers/gpu/drm/i915/intel_psr.c
> > > index 70d4e26e17b5..ad09130cb4ad 100644
> > > --- a/drivers/gpu/drm/i915/intel_psr.c
> > > +++ b/drivers/gpu/drm/i915/intel_psr.c
> > > @@ -159,10 +159,16 @@ void intel_psr_irq_handler(struct
> > > drm_i915_private *dev_priv, u32 psr_iir)
> > >  BIT(TRANSCODER_C);
> > >  
> > >   for_each_cpu_transcoder_masked(dev_priv, cpu_transcoder,
> > > transcoders) {
> > > - /* FIXME: Exit PSR and link train manually when this
> > > happens. */
> > > - if (psr_iir & EDP_PSR_ERROR(cpu_transcoder))
> > > - DRM_DEBUG_KMS("[transcoder %s] PSR aux
> > > error\n",
> > > -   transcoder_name(cpu_transcoder));
> > > + if (psr_iir & EDP_PSR_ERROR(cpu_transcoder)) {
> > > + DRM_WARN("[transcoder %s] PSR aux error\n",
> > > +  transcoder_name(cpu_transcoder));
> > 
> > Downgrade this to debug since the error is handled in the driver? 
> 
> I think is better keep as DRM_WARN so it is shown in regular kernel
> logs this way if a user opens a bug complaning why PSR is disabled we
> can check that is because of PSR aux error.
> 
> > 
> > > +
> > > + spin_lock(_priv->irq_lock);
This lock isn't needed either. How about setting a bool only if the
transcoder is eDP and then scheduling a disable.

> > > + dev_priv->psr.irq_aux_error |=
> > > BIT(cpu_transcoder);
> > 
> > Just ignore the non eDP bits, I don't think we want to do anything
> > with
> > the information that some other bit was set.
> > 
> > > + spin_unlock(_priv->irq_lock);
> > > +
> > > + schedule_work(_priv->psr.work);
> > > + }
> > >  
> > >   if (psr_iir & EDP_PSR_PRE_ENTRY(cpu_transcoder)) {
> > >   dev_priv->psr.last_entry_attempt = time_ns;
> > > @@ -893,11 +899,36 @@ int intel_psr_set_debugfs_mode(struct
> > > drm_i915_private *dev_priv,
> > >   return ret;
> > >  }
> > >  
> > > +static void intel_psr_handle_irq(struct drm_i915_private
> > > *dev_priv)
> > > +{
> > > + struct i915_psr *psr = _priv->psr;
> > > + u32 irq_aux_error;
> > > +
> > > + spin_lock_irq(_priv->irq_lock);
> > > + irq_aux_error = psr->irq_aux_error;
> > > + psr->irq_aux_error = 0;
> > 
> > A subsequent modeset will enable PSR again. I don't expect a
> > modeset
> > to
> > to be able to fix an AUX wake up failure, so might as well disable
> > it
> > for good.
> 
> Add another field to do that or set sink_support=false? I guess PSR
> short pulses errors should also disable it good too?
Reusing sink_support will get confusing, particularly because it is
exposed in debugfs.

> 
> > 
> > > + spin_unlock_irq(_priv->irq_lock);
> > > +
> > > + /* right now PSR is only enabled in eDP */
> > 
> > "right now" implies that PSR could be enabled for non eDP ports,
> > but
> > that's not the case.
> > 
> > 
> > > + WARN_ON(irq_aux_error & ~BIT(TRANSCODER_EDP));
> > 
> > This should go away if you ignore non-EDP bits, and a stack trace
> > isn't
> > particularly useful anyway.
> 
> Okay I will remove this handlings for other transcoders.
> 
> > 
> > > +
> > > + mutex_lock(>lock);
> > 
> > Is this sufficient? Don't we have to serialize against ongoing
> > modesets
> > like we do for debugfs enable/disable. The disable sequence in
> > bspec
> > calls out a running 

Re: [Intel-gfx] [PATCH v2 4/6] drm/i915: Disable PSR when a PSR aux error happen

2018-10-19 Thread Souza, Jose
On Fri, 2018-10-19 at 16:14 -0700, Dhinakaran Pandiyan wrote:
> On Wed, 2018-10-10 at 17:41 -0700, José Roberto de Souza wrote:
> > While PSR is active hardware will do aux transactions by it self to
> > wakeup sink to receive a new frame when necessary. If that
> > transaction is not acked by sink, hardware will trigger this
> > interruption.
> > 
> > So let's disable PSR as it is a hint that there is problem with
> > this
> > sink.
> > 
> > The removed FIXME was asking to manually train the link but we
> > don't
> > need to do that as by spec sink should do a short pulse when it is
> > out of sync with source, we just need to make sure it is awaken and
> > the SDP header with PSR disable will trigger this condition.
> > 
> > Cc: Dhinakaran Pandiyan 
> > Signed-off-by: José Roberto de Souza 
> > ---
> >  drivers/gpu/drm/i915/i915_drv.h  |  1 +
> >  drivers/gpu/drm/i915/intel_psr.c | 39 
> > 
> >  2 files changed, 36 insertions(+), 4 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_drv.h
> > b/drivers/gpu/drm/i915/i915_drv.h
> > index 3017ef037fed..e8ba00dd2c51 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.h
> > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > @@ -638,6 +638,7 @@ struct i915_psr {
> > u8 sink_sync_latency;
> > ktime_t last_entry_attempt;
> > ktime_t last_exit;
> > +   u32 irq_aux_error;
> >  };
> >  
> >  enum intel_pch {
> > diff --git a/drivers/gpu/drm/i915/intel_psr.c
> > b/drivers/gpu/drm/i915/intel_psr.c
> > index 70d4e26e17b5..ad09130cb4ad 100644
> > --- a/drivers/gpu/drm/i915/intel_psr.c
> > +++ b/drivers/gpu/drm/i915/intel_psr.c
> > @@ -159,10 +159,16 @@ void intel_psr_irq_handler(struct
> > drm_i915_private *dev_priv, u32 psr_iir)
> >BIT(TRANSCODER_C);
> >  
> > for_each_cpu_transcoder_masked(dev_priv, cpu_transcoder,
> > transcoders) {
> > -   /* FIXME: Exit PSR and link train manually when this
> > happens. */
> > -   if (psr_iir & EDP_PSR_ERROR(cpu_transcoder))
> > -   DRM_DEBUG_KMS("[transcoder %s] PSR aux
> > error\n",
> > - transcoder_name(cpu_transcoder));
> > +   if (psr_iir & EDP_PSR_ERROR(cpu_transcoder)) {
> > +   DRM_WARN("[transcoder %s] PSR aux error\n",
> > +transcoder_name(cpu_transcoder));
> Downgrade this to debug since the error is handled in the driver? 

I think is better keep as DRM_WARN so it is shown in regular kernel
logs this way if a user opens a bug complaning why PSR is disabled we
can check that is because of PSR aux error.

> 
> > +
> > +   spin_lock(_priv->irq_lock);
> > +   dev_priv->psr.irq_aux_error |=
> > BIT(cpu_transcoder);
> Just ignore the non eDP bits, I don't think we want to do anything
> with
> the information that some other bit was set.
> 
> > +   spin_unlock(_priv->irq_lock);
> > +
> > +   schedule_work(_priv->psr.work);
> > +   }
> >  
> > if (psr_iir & EDP_PSR_PRE_ENTRY(cpu_transcoder)) {
> > dev_priv->psr.last_entry_attempt = time_ns;
> > @@ -893,11 +899,36 @@ int intel_psr_set_debugfs_mode(struct
> > drm_i915_private *dev_priv,
> > return ret;
> >  }
> >  
> > +static void intel_psr_handle_irq(struct drm_i915_private
> > *dev_priv)
> > +{
> > +   struct i915_psr *psr = _priv->psr;
> > +   u32 irq_aux_error;
> > +
> > +   spin_lock_irq(_priv->irq_lock);
> > +   irq_aux_error = psr->irq_aux_error;
> > +   psr->irq_aux_error = 0;
> A subsequent modeset will enable PSR again. I don't expect a modeset
> to
> to be able to fix an AUX wake up failure, so might as well disable it
> for good.

Add another field to do that or set sink_support=false? I guess PSR
short pulses errors should also disable it good too?

> 
> > +   spin_unlock_irq(_priv->irq_lock);
> > +
> > +   /* right now PSR is only enabled in eDP */
> "right now" implies that PSR could be enabled for non eDP ports, but
> that's not the case.
> 
> 
> > +   WARN_ON(irq_aux_error & ~BIT(TRANSCODER_EDP));
> This should go away if you ignore non-EDP bits, and a stack trace
> isn't
> particularly useful anyway.

Okay I will remove this handlings for other transcoders.

> 
> > +
> > +   mutex_lock(>lock);
> Is this sufficient? Don't we have to serialize against ongoing
> modesets
> like we do for debugfs enable/disable. The disable sequence in bspec
> calls out a running pipe and port as pre-requisites.

HW will only send a aux transaction when exiting PSR, in this cases
pipe will always be running:
- exiting because of changes in the screen
- exiting because pipe will be disabled
- exiting because of PSR error

> 
> Ccing Ville and Maarten to get their opinion on this.
>  
> > +
> > +   intel_psr_disable_locked(psr->dp);
> > +   /* let's make sure that sink is awaken */
> > +   drm_dp_dpcd_writeb(>dp->aux, DP_SET_POWER,
> > DP_SET_POWER_D0);
> Given that the hardware initiated AUX 

Re: [Intel-gfx] [PATCH v2 4/6] drm/i915: Disable PSR when a PSR aux error happen

2018-10-19 Thread Dhinakaran Pandiyan
On Wed, 2018-10-10 at 17:41 -0700, José Roberto de Souza wrote:
> While PSR is active hardware will do aux transactions by it self to
> wakeup sink to receive a new frame when necessary. If that
> transaction is not acked by sink, hardware will trigger this
> interruption.
> 
> So let's disable PSR as it is a hint that there is problem with this
> sink.
> 
> The removed FIXME was asking to manually train the link but we don't
> need to do that as by spec sink should do a short pulse when it is
> out of sync with source, we just need to make sure it is awaken and
> the SDP header with PSR disable will trigger this condition.
> 
> Cc: Dhinakaran Pandiyan 
> Signed-off-by: José Roberto de Souza 
> ---
>  drivers/gpu/drm/i915/i915_drv.h  |  1 +
>  drivers/gpu/drm/i915/intel_psr.c | 39 
> 
>  2 files changed, 36 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h
> b/drivers/gpu/drm/i915/i915_drv.h
> index 3017ef037fed..e8ba00dd2c51 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -638,6 +638,7 @@ struct i915_psr {
>   u8 sink_sync_latency;
>   ktime_t last_entry_attempt;
>   ktime_t last_exit;
> + u32 irq_aux_error;
>  };
>  
>  enum intel_pch {
> diff --git a/drivers/gpu/drm/i915/intel_psr.c
> b/drivers/gpu/drm/i915/intel_psr.c
> index 70d4e26e17b5..ad09130cb4ad 100644
> --- a/drivers/gpu/drm/i915/intel_psr.c
> +++ b/drivers/gpu/drm/i915/intel_psr.c
> @@ -159,10 +159,16 @@ void intel_psr_irq_handler(struct
> drm_i915_private *dev_priv, u32 psr_iir)
>  BIT(TRANSCODER_C);
>  
>   for_each_cpu_transcoder_masked(dev_priv, cpu_transcoder,
> transcoders) {
> - /* FIXME: Exit PSR and link train manually when this
> happens. */
> - if (psr_iir & EDP_PSR_ERROR(cpu_transcoder))
> - DRM_DEBUG_KMS("[transcoder %s] PSR aux
> error\n",
> -   transcoder_name(cpu_transcoder));
> + if (psr_iir & EDP_PSR_ERROR(cpu_transcoder)) {
> + DRM_WARN("[transcoder %s] PSR aux error\n",
> +  transcoder_name(cpu_transcoder));
Downgrade this to debug since the error is handled in the driver? 

> +
> + spin_lock(_priv->irq_lock);
> + dev_priv->psr.irq_aux_error |=
> BIT(cpu_transcoder);
Just ignore the non eDP bits, I don't think we want to do anything with
the information that some other bit was set.

> + spin_unlock(_priv->irq_lock);
> +
> + schedule_work(_priv->psr.work);
> + }
>  
>   if (psr_iir & EDP_PSR_PRE_ENTRY(cpu_transcoder)) {
>   dev_priv->psr.last_entry_attempt = time_ns;
> @@ -893,11 +899,36 @@ int intel_psr_set_debugfs_mode(struct
> drm_i915_private *dev_priv,
>   return ret;
>  }
>  
> +static void intel_psr_handle_irq(struct drm_i915_private *dev_priv)
> +{
> + struct i915_psr *psr = _priv->psr;
> + u32 irq_aux_error;
> +
> + spin_lock_irq(_priv->irq_lock);
> + irq_aux_error = psr->irq_aux_error;
> + psr->irq_aux_error = 0;
A subsequent modeset will enable PSR again. I don't expect a modeset to
to be able to fix an AUX wake up failure, so might as well disable it
for good.

> + spin_unlock_irq(_priv->irq_lock);
> +
> + /* right now PSR is only enabled in eDP */
"right now" implies that PSR could be enabled for non eDP ports, but
that's not the case.


> + WARN_ON(irq_aux_error & ~BIT(TRANSCODER_EDP));
This should go away if you ignore non-EDP bits, and a stack trace isn't
particularly useful anyway.

> +
> + mutex_lock(>lock);
Is this sufficient? Don't we have to serialize against ongoing modesets
like we do for debugfs enable/disable. The disable sequence in bspec
calls out a running pipe and port as pre-requisites.

Ccing Ville and Maarten to get their opinion on this.
 
> +
> + intel_psr_disable_locked(psr->dp);
> + /* let's make sure that sink is awaken */
> + drm_dp_dpcd_writeb(>dp->aux, DP_SET_POWER,
> DP_SET_POWER_D0);
Given that the hardware initiated AUX write failed, I would recommend
reading back the sink PSR status to make sure disable worked.

> +
> + mutex_unlock(_priv->psr.lock);
> +}
> +
>  static void intel_psr_work(struct work_struct *work)
>  {
>   struct drm_i915_private *dev_priv =
>   container_of(work, typeof(*dev_priv), psr.work);
>  
> + if (READ_ONCE(dev_priv->psr.irq_aux_error))
> + intel_psr_handle_irq(dev_priv);
If psr_work() was already executing and past this check,
schedule_work() in intel_psr_irq_handler will return a failure and
disable PSR would now depend on getting an invalidate and flush
operation. We should disable PSR without any dependency on flush or
invalidate.


> +
>   mutex_lock(_priv->psr.lock);
>  
>   if (!dev_priv->psr.enabled)

___