On 26/11/2014 14:42, Daniel Vetter wrote:
On Wed, Nov 26, 2014 at 3:12 PM, John Harrison
<john.c.harri...@intel.com> wrote:
diff --git a/drivers/gpu/drm/i915/i915_drv.h
b/drivers/gpu/drm/i915/i915_drv.h
index 8bfdac6..831fae2 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -3130,4 +3130,11 @@ wait_remaining_ms_from_jiffies(unsigned long
timestamp_jiffies, int to_wait_ms)
         }
   }
   +static inline void i915_trace_irq_get(struct intel_engine_cs *ring,
+                                     struct drm_i915_gem_request *req)
+{
+       if (ring->trace_irq_req == NULL && ring->irq_get(ring))
+               i915_gem_request_assign(&ring->trace_irq_req, req);
This looks a bit suspiciuos. Thus far ring->trace_irq_req was essentially
ot protected at all by anything. But that was nothing troublesome since we
didn't hang a real resource of it.

But now there's a refcounted request in that pointer, which means if we
race we leak. I'll skip this patch for now.
-Daniel

Race how? The assignment only ever occurs from inside execbuffer submission
at which point the driver mutex lock is held. Therefore it is very
definitely protected. Not doing the reference count means that there is now
the possibility of a dangling pointer and thus the possibility of going bang
with a kernel oops.
Hm, ->trace_irq_seqno is indeed always touched from the a calling
context with dev->struct_mutex held. Somehow I've misrembered that
since the realtime/tracing folks are really freaked out about what
we're doing here. But from that pov your patch doesn't really make
things worse, so I'll pull it in.

Btw I don't see the oops really without this patch. What would blow up?
-Daniel

The sole access (and clear to null) of the trace pointer is done from retire requests after the requests have been retired. Thus the request structure could have just been freed immediately before it is used. The code could be re-ordered to be safer but I'm not entirely sure what the trace pointer is for or what it might potentially be used for in the future. With the reference counting, the ordering is irrelevant. If the pointer exists then it is safe to use.

The point is that anywhere that keeps a copy of a request pointer really should reference count that copy. Otherwise there is the possibility that the pointer could become stale. Either now or with future code changes. If the copy is always done with the request_assign() function then the pointer is guaranteed safe for all time.

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

Reply via email to