skl: Add support for the SAGV, fix underrun hangs

Lyude Paul Thu, 18 Aug 2016 10:05:59 -0400

On Thu, 2016-08-18 at 09:39 +0200, Maarten Lankhorst wrote:
> Hey,
> 
> > On 17-08-16 at 21:55, Lyude wrote:
> > 
> > Since the watermark calculations for Skylake are still broken, we're apt
> > to hit underruns very easily under multi-monitor configurations.
> > While it would be lovely if this was fixed, it's not. Another problem
> > that's been coming from this, however, is the mysterious issue of
> > underruns causing full system hangs. An easy way to reproduce this with
> > a Skylake system:
> > 
> > - Get a laptop with a Skylake GPU, and hook up two external monitors to
> >   it
> > - Move the cursor from the built-in LCD to one of the external displays
> >   as quickly as you can
> > - You'll get a few pipe underruns, and eventually the entire system will
> >   just freeze.
> > 
> > After doing a lot of investigation and reading through the bspec, I
> > found the existence of the SAGV, which is responsible for adjusting the
> > system agent voltage and clock frequencies depending on how much power
> > we need. According to the bspec:
> > 
> > "The display engine access to system memory is blocked during the
> >  adjustment time. SAGV defaults to enabled. Software must use the
> >  GT-driver pcode mailbox to disable SAGV when the display engine is not
> >  able to tolerate the blocking time."
> > 
> > The rest of the bspec goes on to explain that software can simply leave
> > the SAGV enabled, and disable it when we use interlaced pipes/have more
> > than one pipe active.
> > 
> > Sure enough, with this patchset the system hangs resulting from pipe
> > underruns on Skylake have completely vanished on my T460s. Additionally,
> > the bspec mentions turning off the SAGV with more than one pipe enabled
> > as a workaround for display underruns. While this patch doesn't entirely
> > fix that, it looks like it does improve the situation a little bit so
> > it's likely this is going to be required to make watermarks on Skylake
> > fully functional.
> > 
> > This will still need additional work in the future: we shouldn't be
> > enabling the SAGV if any of the currently enabled planes can't enable WM
> > levels that introduce latencies >= 30 µs.
> > 
> > Changes since v11:
> >  - Add skl_can_enable_sagv()
> >  - Make sure we don't enable SAGV when not all planes can enable
> >    watermarks >= the SAGV engine block time. I was originally going to
> >    save this for later, but I recently managed to run into a machine
> >    that was having problems with a single pipe configuration + SAGV.
> >  - Make comparisons to I915_SKL_SAGV_NOT_CONTROLLED explicit
> >  - Change I915_SAGV_DYNAMIC_FREQ to I915_SAGV_ENABLE
> >  - Move printks outside of mutexes
> >  - Don't print error messages twice
> > Changes since v10:
> >  - Apparently sandybridge_pcode_read actually writes values and reads
> >    them back, despite its misleading function name. This means we've
> >    been doing this mostly wrong and have been writing garbage to the
> >    SAGV control. Because of this, we no longer attempt to read the SAGV
> >    status during initialization (since there are no helpers for this).
> >  - mlankhorst noticed that this patch was breaking on some very early
> >    pre-release Skylake machines, which apparently don't allow you to
> >    disable the SAGV. To prevent machines from failing tests due to SAGV
> >    errors, if the first time we try to control the SAGV results in the
> >    mailbox indicating an invalid command, we just disable future attempts
> >    to control the SAGV state by setting dev_priv->skl_sagv_status to
> >    I915_SKL_SAGV_NOT_CONTROLLED and make a note of it in dmesg.
> >  - Move mutex_unlock() a little higher in skl_enable_sagv(). This
> >    doesn't actually fix anything, but lets us release the lock a little
> >    sooner since we're finished with it.
> > Changes since v9:
> >  - Only enable/disable sagv on Skylake
> > Changes since v8:
> >  - Add intel_state->modeset guard to the conditional for
> >    skl_enable_sagv()
> > Changes since v7:
> >  - Remove GEN9_SAGV_LOW_FREQ, replace with GEN9_SAGV_IS_ENABLED (that's
> >    all we use it for anyway)
> >  - Use GEN9_SAGV_IS_ENABLED instead of 0x1 for clarification
> >  - Fix a styling error that snuck past me
> > Changes since v6:
> >  - Protect skl_enable_sagv() with intel_state->modeset conditional in
> >    intel_atomic_commit_tail()
> > Changes since v5:
> >  - Don't use is_power_of_2. Makes things confusing
> >  - Don't use the old state to figure out whether or not to
> >    enable/disable the sagv, use the new one
> >  - Split the loop in skl_disable_sagv into its own function
> >  - Move skl_sagv_enable/disable() calls into intel_atomic_commit_tail()
> > Changes since v4:
> >  - Use is_power_of_2 against active_crtcs to check whether we have > 1
> >    pipe enabled
> >  - Fix skl_sagv_get_hw_state(): (temp & 0x1) indicates disabled, 0x0
> >    enabled
> >  - Call skl_sagv_enable/disable() from pre/post-plane updates
> > Changes since v3:
> >  - Use time_before() to compare timeout to jiffies
> > Changes since v2:
> >  - Really apply minor style nitpicks to patch this time
> > Changes since v1:
> >  - Added comments about this probably being one of the requirements to
> >    fixing Skylake's watermark issues
> >  - Minor style nitpicks from Matt Roper
> >  - Disable these functions on Broxton, since it doesn't have an SAGV
> > 
> > Signed-off-by: Lyude <cpaul at redhat.com>
> > Cc: Matt Roper <matthew.d.roper at intel.com>
> > Cc: Maarten Lankhorst <maarten.lankhorst at linux.intel.com>
> > Cc: Daniel Vetter <daniel.vetter at ffwll.ch>
> > Cc: Ville Syrjälä <ville.syrjala at linux.intel.com>
> > Cc: stable at vger.kernel.org
> > ---
> >  drivers/gpu/drm/i915/i915_drv.h      |   7 ++
> >  drivers/gpu/drm/i915/i915_reg.h      |   4 +
> >  drivers/gpu/drm/i915/intel_display.c |  11 +++
> >  drivers/gpu/drm/i915/intel_drv.h     |   3 +
> >  drivers/gpu/drm/i915/intel_pm.c      | 148 +++++++++++++++++++++++++++++++++++
> >  5 files changed, 173 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > index 35caa9b..f20530bb 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.h
> > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > @@ -1949,6 +1949,13 @@ struct drm_i915_private {
> >     struct i915_suspend_saved_registers regfile;
> >     struct vlv_s0ix_state vlv_s0ix_state;
> >  
> > +   enum {
> > +           I915_SKL_SAGV_UNKNOWN = 0,
> > +           I915_SKL_SAGV_DISABLED,
> > +           I915_SKL_SAGV_ENABLED,
> > +           I915_SKL_SAGV_NOT_CONTROLLED
> > +   } skl_sagv_status;
> > +
> >     struct {
> >             /*
> >              * Raw watermark latency values:
> > diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> > index 7419fbf..be82c49 100644
> > --- a/drivers/gpu/drm/i915/i915_reg.h
> > +++ b/drivers/gpu/drm/i915/i915_reg.h
> > @@ -7153,6 +7153,10 @@ enum {
> >  #define   HSW_PCODE_DE_WRITE_FREQ_REQ            0x17
> >  #define   DISPLAY_IPS_CONTROL                    0x19
> >  #define   HSW_PCODE_DYNAMIC_DUTY_CYCLE_CONTROL   0x1A
> > +#define   GEN9_PCODE_SAGV_CONTROL                0x21
> > +#define     GEN9_SAGV_DISABLE                    0x0
> > +#define     GEN9_SAGV_IS_DISABLED                0x1
> > +#define     GEN9_SAGV_ENABLE                     0x3
> >  #define GEN6_PCODE_DATA                          _MMIO(0x138128)
> >  #define   GEN6_PCODE_FREQ_IA_RATIO_SHIFT         8
> >  #define   GEN6_PCODE_FREQ_RING_RATIO_SHIFT       16
> > diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> > index 781d22e..ca4b83f 100644
> > --- a/drivers/gpu/drm/i915/intel_display.c
> > +++ b/drivers/gpu/drm/i915/intel_display.c
> > @@ -14156,6 +14156,13 @@ static void intel_atomic_commit_tail(struct drm_atomic_state *state)
> >                  intel_state->cdclk_pll_vco != dev_priv->cdclk_pll.vco))
> >                     dev_priv->display.modeset_commit_cdclk(state);
> >  
> > +           /*
> > +            * SKL workaround: bspec recommends we disable the SAGV when we
> > +            * have more than one pipe enabled
> > +            */
> > +           if (IS_SKYLAKE(dev_priv) && !skl_can_enable_sagv(state))
> > +                   skl_disable_sagv(dev_priv);
> > +
> >             intel_modeset_verify_disabled(dev);
> >     }
> >  
> > @@ -14229,6 +14236,10 @@ static void intel_atomic_commit_tail(struct drm_atomic_state *state)
> >             intel_modeset_verify_crtc(crtc, old_crtc_state, crtc->state);
> >     }
> >  
> > +   if (IS_SKYLAKE(dev_priv) && intel_state->modeset &&
> > +       skl_can_enable_sagv(state))
> > +           skl_enable_sagv(dev_priv);
> > +
> >     drm_atomic_helper_commit_hw_done(state);
> >  
> >     if (intel_state->modeset)
> > diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
> > index 1c700b0..d203c77 100644
> > --- a/drivers/gpu/drm/i915/intel_drv.h
> > +++ b/drivers/gpu/drm/i915/intel_drv.h
> > @@ -1722,6 +1722,9 @@ void ilk_wm_get_hw_state(struct drm_device *dev);
> >  void skl_wm_get_hw_state(struct drm_device *dev);
> >  void skl_ddb_get_hw_state(struct drm_i915_private *dev_priv,
> >                            struct skl_ddb_allocation *ddb /* out */);
> > +bool skl_can_enable_sagv(struct drm_atomic_state *state);
> > +int skl_enable_sagv(struct drm_i915_private *dev_priv);
> > +int skl_disable_sagv(struct drm_i915_private *dev_priv);
> >  uint32_t ilk_pipe_pixel_rate(const struct intel_crtc_state *pipe_config);
> >  bool ilk_disable_lp_wm(struct drm_device *dev);
> >  int sanitize_rc6_option(struct drm_i915_private *dev_priv, int enable_rc6);
> > diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> > index b4cf988..fed2bae8 100644
> > --- a/drivers/gpu/drm/i915/intel_pm.c
> > +++ b/drivers/gpu/drm/i915/intel_pm.c
> > @@ -2860,6 +2860,7 @@ bool ilk_disable_lp_wm(struct drm_device *dev)
> >  
> >  #define SKL_DDB_SIZE              896     /* in blocks */
> >  #define BXT_DDB_SIZE              512
> > +#define SKL_SAGV_BLOCK_TIME 30 /* µs */
> >  
> >  /*
> >   * Return the index of a plane in the SKL DDB and wm result arrays.  Primary
> > @@ -2883,6 +2884,153 @@ skl_wm_plane_id(const struct intel_plane *plane)
> >     }
> >  }
> >  
> > +/*
> > + * SAGV dynamically adjusts the system agent voltage and clock frequencies
> > + * depending on power and performance requirements. The display engine access
> > + * to system memory is blocked during the adjustment time. Because of the
> > + * blocking time, having this enabled can cause full system hangs and/or pipe
> > + * underruns if we don't meet all of the following requirements:
> > + *
> > + *  - <= 1 pipe enabled
> > + *  - All planes can enable watermarks for latencies >= SAGV engine block time
> > + *  - We're not using an interlaced display configuration
> > + */
> > +int
> > +skl_enable_sagv(struct drm_i915_private *dev_priv)
> > +{
> > +   int ret;
> > +
> > +   if (dev_priv->skl_sagv_status == I915_SKL_SAGV_NOT_CONTROLLED ||
> > +       dev_priv->skl_sagv_status == I915_SKL_SAGV_ENABLED)
> > +           return 0;
> > +
> > +   DRM_DEBUG_KMS("Enabling the SAGV\n");
> > +   mutex_lock(&dev_priv->rps.hw_lock);
> > +
> > +   ret = sandybridge_pcode_write(dev_priv, GEN9_PCODE_SAGV_CONTROL,
> > +                                 GEN9_SAGV_ENABLE);
> > +
> > +   /* We don't need to wait for the SAGV when enabling */
> > +   mutex_unlock(&dev_priv->rps.hw_lock);
> > +
> > +   /*
> > +    * Some skl systems, pre-release machines in particular,
> > +    * don't actually have an SAGV.
> > +    */
> > +   if (ret == -ENOSYS) {
> > +           DRM_DEBUG_DRIVER("No SAGV found on system, ignoring\n");
> > +           dev_priv->skl_sagv_status = I915_SKL_SAGV_NOT_CONTROLLED;
> > +           return 0;
> > +   } else if (ret < 0) {
> > +           DRM_ERROR("Failed to enable the SAGV\n");
> > +           return ret;
> > +   }
> > +
> > +   dev_priv->skl_sagv_status = I915_SKL_SAGV_ENABLED;
> > +   return 0;
> > +}
> > +
> > +static int
> > +skl_do_sagv_disable(struct drm_i915_private *dev_priv)
> > +{
> > +   int ret;
> > +   uint32_t temp = GEN9_SAGV_DISABLE;
> > +
> > +   ret = sandybridge_pcode_read(dev_priv, GEN9_PCODE_SAGV_CONTROL,
> > +                                &temp);
> > +   if (ret)
> > +           return ret;
> > +   else
> > +           return temp & GEN9_SAGV_IS_DISABLED;
> > +}
> > +
> > +int
> > +skl_disable_sagv(struct drm_i915_private *dev_priv)
> > +{
> > +   int ret, result;
> > +
> > +   if (dev_priv->skl_sagv_status == I915_SKL_SAGV_NOT_CONTROLLED ||
> > +       dev_priv->skl_sagv_status == I915_SKL_SAGV_DISABLED)
> > +           return 0;
> > +
> > +   DRM_DEBUG_KMS("Disabling the SAGV\n");
> > +   mutex_lock(&dev_priv->rps.hw_lock);
> > +
> > +   /* bspec says to keep retrying for at least 1 ms */
> > +   ret = wait_for(result = skl_do_sagv_disable(dev_priv), 1);
> > +   mutex_unlock(&dev_priv->rps.hw_lock);
> > +
> > +   if (ret == -ETIMEDOUT) {
> > +           DRM_ERROR("Request to disable SAGV timed out\n");
> > +           return -ETIMEDOUT;
> > +   }
> > +
> > +   /*
> > +    * Some skl systems, pre-release machines in particular,
> > +    * don't actually have an SAGV.
> > +    */
> > +   if (result == -ENOSYS) {
> > +           DRM_DEBUG_DRIVER("No SAGV found on system, ignoring\n");
> > +           dev_priv->skl_sagv_status = I915_SKL_SAGV_NOT_CONTROLLED;
> > +           return 0;
> > +   } else if (result < 0) {
> > +           DRM_ERROR("Failed to disable the SAGV\n");
> > +           return result;
> > +   }
> > +
> > +   dev_priv->skl_sagv_status = I915_SKL_SAGV_DISABLED;
> > +   return 0;
> > +}
> > +
> > +bool skl_can_enable_sagv(struct drm_atomic_state *state)
> > +{
> > +   struct drm_device *dev = state->dev;
> > +   struct drm_i915_private *dev_priv = to_i915(dev);
> > +   struct intel_atomic_state *intel_state = to_intel_atomic_state(state);
> > +   struct drm_crtc *crtc;
> > +   enum pipe pipe;
> > +   int level, plane;
> > +
> > +   /*
> > +    * SKL workaround: bspec recommends we disable the SAGV when we have
> > +    * more than one pipe enabled
> > +    *
> > +    * If there are no active CRTCs, no additional checks need be performed
> > +    */
> > +   if (hweight32(intel_state->active_crtcs) == 0)
> > +           return true;
> > +   else if (hweight32(intel_state->active_crtcs) > 1)
> > +           return false;
> > +
> > +   /* Since we're now guaranteed to only have one active CRTC... */
> > +   pipe = ffs(intel_state->active_crtcs) - 1;
> > +   crtc = dev_priv->pipe_to_crtc_mapping[pipe];
> > +
> > +   if (crtc->state->mode.flags & DRM_MODE_FLAG_INTERLACE)
> > +           return false;
> > +
> > +   for_each_plane(dev_priv, pipe, plane) {
> > +           /* Skip this plane if it's not enabled */
> > +           if (intel_state->wm_results.plane[pipe][plane][0] == 0)
> > +                   continue;
> > +
> > +           /* Find the highest enabled wm level for this plane */
> > +           for (level = ilk_wm_max_level(dev);
> > +                intel_state->wm_results.plane[pipe][plane][level] == 0;
> > +                --level);
> > +
> > +           /*
> > +            * If any of the planes on this pipe don't enable wm levels
> > +            * that incur memory latencies higher than 30µs we can't enable
> > +            * the SAGV
> > +            */
> > +           if (dev_priv->wm.skl_latency[level] < SKL_SAGV_BLOCK_TIME)
> > +                   return false;
> Shouldn't this check be >= BLOCK_TIME?
> 
That's the requirement for enabling the SAGV, but the conditional here is still
correct: it tests for the failure case, using the latency of the highest
enabled watermark level. Say a plane has the following levels:

WM0 - 2µs
WM1 - 5µs
WM2 - 10µs
WM3 - 20µs
WM4+ - disabled

(20µs < BLOCK_TIME) == true, which indicates we don't have any watermark levels
with latency values >= 30µs. We can't enable the SAGV, so we return false.

WM0 - 2µs
WM1 - 5µs
WM2 - 10µs
WM3 - 20µs
WM4 - 33µs
WM5 - 50µs
WM6 - 70µs
WM7 - 99µs

(99µs < BLOCK_TIME) == false, and 99 >= BLOCK_TIME, so we end up returning true
to indicate it's safe to enable.
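
To make the two cases above concrete, here is a minimal standalone sketch of
the per-plane check. It is not the driver code: plane_tolerates_sagv(),
SAGV_BLOCK_TIME_US, MAX_WM_LEVEL and the enabled[] array are made up for
illustration (the patch itself treats a zero entry in wm_results as
"disabled"), and the latency table is just the example values from this mail.

```c
#include <stdbool.h>
#include <stdint.h>

#define SAGV_BLOCK_TIME_US 30 /* mirrors SKL_SAGV_BLOCK_TIME; name is hypothetical */
#define MAX_WM_LEVEL 7        /* assumption: eight levels, WM0..WM7 */

/*
 * Hypothetical helper: returns true if this plane does not prevent the SAGV
 * from being enabled.
 *
 * latency_us[level] is the memory latency of each watermark level in µs,
 * enabled[level] says whether the plane can use that level.
 */
static bool plane_tolerates_sagv(const uint16_t latency_us[MAX_WM_LEVEL + 1],
                                 const bool enabled[MAX_WM_LEVEL + 1])
{
	int level;

	/* A plane with WM0 disabled is not in use and imposes no constraint */
	if (!enabled[0])
		return true;

	/* Find the highest enabled level; stops at 0 at the latest */
	for (level = MAX_WM_LEVEL; !enabled[level]; --level)
		;

	/* Same test as the patch, with the sense inverted */
	return latency_us[level] >= SAGV_BLOCK_TIME_US;
}
```

With the first table (WM4 and up disabled) the highest enabled level is WM3 at
20µs, so the helper reports that the SAGV must stay off; with all eight levels
enabled it looks at WM7 (99µs) and allows it.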

> ~Maarten

[PATCH v12 2/7] drm/i915/skl: Add support for the SAGV, fix underrun hangs