There's an entire pile of issues in here:

- Use the main RING_HEAD register, not ACTHD. ACTHD points at the gtt
  offset of the batch buffer when a batch is executed. Semaphores are
  always emitted to the main ring, so we always want to look at that.

- Mask the obtained HEAD pointer with the actual ring size, which is
  much smaller. Together with the above issue this resulted us in
  trying to dereference a pointer way outside of the ring mmio
  mapping. The resulting invalid access in interrupt context
  (hangcheck is executed from timers) lead to a full blown kernel
  panic. The fbcon panic handler then tried to frob our driver harder,
  resulting in a full machine hang at least on my snb here where I've
  stumbled over this.

- Handle ring wrapping correctly and be a bit more explicit about how
  many dwords we're scanning. We probably should also scan more than
  just 4 ...

- Space out some of teh computations for readability.

This prevents hard-hangs on my snb here. Verdict from QA is still
pending, but the symptoms match.

Cc: Mika Kuoppala <mika.kuopp...@intel.com>
Cc: Ben Widawsky <b...@bwidawsk.net>
Cc: Chris Wilson <ch...@chris-wilson.co.uk>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74100
Signed-off-by: Daniel Vetter <daniel.vet...@ffwll.ch>
---
 drivers/gpu/drm/i915/i915_irq.c | 38 ++++++++++++++++++++++++++------------
 1 file changed, 26 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index be2713f12e08..473372a6c97d 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -2529,29 +2529,43 @@ static struct intel_ring_buffer *
 semaphore_waits_for(struct intel_ring_buffer *ring, u32 *seqno)
 {
        struct drm_i915_private *dev_priv = ring->dev->dev_private;
-       u32 cmd, ipehr, acthd, acthd_min;
+       u32 cmd, ipehr, head;
+       int i;
 
        ipehr = I915_READ(RING_IPEHR(ring->mmio_base));
        if ((ipehr & ~(0x3 << 16)) !=
            (MI_SEMAPHORE_MBOX | MI_SEMAPHORE_COMPARE | MI_SEMAPHORE_REGISTER))
                return NULL;
 
-       /* ACTHD is likely pointing to the dword after the actual command,
-        * so scan backwards until we find the MBOX.
+       /*
+        * HEAD is likely pointing to the dword after the actual command,
+        * so scan backwards until we find the MBOX. But limit it to just 3
+        * dwords. Note that we don't care about ACTHD here since that might
+        * point at at batch, and semaphores are always emitted into the
+        * ringbuffer itself.
         */
-       acthd = intel_ring_get_active_head(ring) & HEAD_ADDR;
-       acthd_min = max((int)acthd - 3 * 4, 0);
-       do {
-               cmd = ioread32(ring->virtual_start + acthd);
+       head = I915_READ_HEAD(ring) & HEAD_ADDR;
+
+       for (i = 4; i; --i) {
+               /*
+                * Be paranoid and presume the hw has gone off into the wild -
+                * our ring is smaller than what the hardware (and hence
+                * HEAD_ADDR) allows. Also handles wrap-around.
+                */
+               head &= ring->size - 1;
+
+               /* This here seems to blow up */
+               cmd = ioread32(ring->virtual_start + head);
                if (cmd == ipehr)
                        break;
 
-               acthd -= 4;
-               if (acthd < acthd_min)
-                       return NULL;
-       } while (1);
+               head -= 4;
+       }
+
+       if (!i)
+               return NULL;
 
-       *seqno = ioread32(ring->virtual_start+acthd+4)+1;
+       *seqno = ioread32(ring->virtual_start + head + 4) + 1;
        return &dev_priv->ring[(ring->id + (((ipehr >> 17) & 1) + 1)) % 3];
 }
 
-- 
1.8.1.4

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

Reply via email to