Re: [Intel-gfx] [RFC 3/9] drm/i915: Convert requests to use struct fence

2015-08-03 Thread Tvrtko Ursulin


On 07/28/2015 11:10 AM, John Harrison wrote:
[snip]

  static inline void
@@ -2267,7 +2284,7 @@ i915_gem_request_unreference__unlocked(struct
drm_i915_gem_request *req)
  return;

  dev = req-ring-dev;
-if (kref_put_mutex(req-ref, i915_gem_request_free,
dev-struct_mutex))
+if (kref_put_mutex(req-fence.refcount, fence_release,
dev-struct_mutex))
  mutex_unlock(dev-struct_mutex);


Would it be nicer to add fence_put_mutex(struct fence *, struct mutex
*) for this? It would avoid the layering violation of requests peeking
into fence implementation details.


Maarten pointed out that adding 'fence_put_mutex()' is breaking the
fence ABI itself. It requires users of any random fence to know and
worry about what mutex objects that fence's internal implementation
might require.


I don't see what it has to do with the ABI? I dislike both approaches 
actually and don't like the question of what is the lesser evil. :)



This is a bit more hacky from our point of view but it is only a
temporary measure until the mutex lock is not required for
dereferencing. At that point all the nasty stuff disappears completely.


Yes saw that in later patches. No worries then, just a consequence of 
going patch by patch.



+
  if (req-file_priv)
  i915_gem_request_remove_from_client(req);

@@ -2637,6 +2639,47 @@ void i915_gem_request_free(struct kref *req_ref)
  kmem_cache_free(req-i915-requests, req);
  }

+static const char *i915_gem_request_get_driver_name(struct fence
*req_fence)
+{
+return i915_request;
+}


I think this becomes kind of ABI once added so we need to make sure
the best name is chosen to start with. I couldn't immediately figure
out why not just i915?


The thought was that we might start using fences for other purposes at
some point in the future. At which point it is helpful to differentiate
them. If nothing else, a previous iteration of the native sync
implementation was using different fence objects due to worries about
allowing user land to get its grubby mitts on important driver internal
structures.


I don't follow on the connection between these names and the last 
concern? If I did, would it explain why driver name i915 and 
differentiation by timeline names would be problematic? Looks much 
cleaner to me.



+
+static const char *i915_gem_request_get_timeline_name(struct fence
*req_fence)
+{
+struct drm_i915_gem_request *req = container_of(req_fence,
+ typeof(*req), fence);
+return req-ring-name;
+}
+
+static bool i915_gem_request_enable_signaling(struct fence *req_fence)
+{
+/* Interrupt driven fences are not implemented yet.*/
+WARN(true, This should not be called!);
+return true;


I suppose WARN is not really needed in the interim patch. Would return
false work?


The point is to keep the driver 'safe' if that patch is viewed as stand
alone. I.e., if the interrupt follow up does not get merged immediately
after (or not at all) then this prevents people accidentally trying to
use an unsupported interface without first implementing it.


I assumed true means signaling enabled successfully but false 
actually means fence already passed so you are right. I blame the 
un-intuitive API. :)



+fence_base = fence_context_alloc(I915_NUM_RINGS);
+
  /* Now it is safe to go back round and do everything else: */
  for_each_ring(ring, dev_priv, i) {
  struct drm_i915_gem_request *req;

  WARN_ON(!ring-default_context);

+ring-fence_context = fence_base + i;


Could you store fence_base in dev_priv and then ring-init_hw could
set up the fence_context on its own?


Probably. It seemed to make more sense to me to keep the fence
allocation all in one place rather than splitting it in half and
distributing it. It gets removed again with the per context per ring
timelines anyway.


Yes saw that later, never mind then.

Tvrtko





___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [RFC 3/9] drm/i915: Convert requests to use struct fence

2015-08-03 Thread Tvrtko Ursulin


On 07/28/2015 11:18 AM, John Harrison wrote:

On 22/07/2015 15:45, Tvrtko Ursulin wrote:


On 07/17/2015 03:31 PM, john.c.harri...@intel.com wrote:

From: John Harrison john.c.harri...@intel.com

There is a construct in the linux kernel called 'struct fence' that
is intended
to keep track of work that is executed on hardware. I.e. it solves
the basic
problem that the drivers 'struct drm_i915_gem_request' is trying to
address. The
request structure does quite a lot more than simply track the
execution progress
so is very definitely still required. However, the basic completion
status side
could be updated to use the ready made fence implementation and gain
all the
advantages that provides.

This patch makes the first step of integrating a struct fence into
the request.
It replaces the explicit reference count with that of the fence. It also
replaces the 'is completed' test with the fence's equivalent.
Currently, that
simply chains on to the original request implementation. A future
patch will
improve this.

For: VIZ-5190
Signed-off-by: John Harrison john.c.harri...@intel.com
---
  drivers/gpu/drm/i915/i915_drv.h | 45 +
  drivers/gpu/drm/i915/i915_gem.c | 58
++---
  drivers/gpu/drm/i915/intel_lrc.c|  1 +
  drivers/gpu/drm/i915/intel_ringbuffer.c |  1 +
  drivers/gpu/drm/i915/intel_ringbuffer.h |  3 ++
  5 files changed, 80 insertions(+), 28 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h
b/drivers/gpu/drm/i915/i915_drv.h
index cf6761c..79d346c 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -50,6 +50,7 @@
  #include linux/intel-iommu.h
  #include linux/kref.h
  #include linux/pm_qos.h
+#include linux/fence.h

  /* General customization:
   */
@@ -2150,7 +2151,17 @@ void i915_gem_track_fb(struct
drm_i915_gem_object *old,
   * initial reference taken using kref_init
   */
  struct drm_i915_gem_request {
-struct kref ref;
+/**
+ * Underlying object for implementing the signal/wait stuff.
+ * NB: Never call fence_later() or return this fence object to user
+ * land! Due to lazy allocation, scheduler re-ordering,
pre-emption,
+ * etc., there is no guarantee at all about the validity or
+ * sequentiality of the fence's seqno! It is also unsafe to let
+ * anything outside of the i915 driver get hold of the fence object
+ * as the clean up when decrementing the reference count requires
+ * holding the driver mutex lock.
+ */
+struct fence fence;

  /** On Which ring this request was generated */
  struct drm_i915_private *i915;
@@ -2227,7 +2238,13 @@ int i915_gem_request_alloc(struct
intel_engine_cs *ring,
 struct intel_context *ctx,
 struct drm_i915_gem_request **req_out);
  void i915_gem_request_cancel(struct drm_i915_gem_request *req);
-void i915_gem_request_free(struct kref *req_ref);
+
+static inline bool i915_gem_request_completed(struct
drm_i915_gem_request *req,
+  bool lazy_coherency)
+{
+return fence_is_signaled(req-fence);
+}
+
  int i915_gem_request_add_to_client(struct drm_i915_gem_request *req,
 struct drm_file *file);

@@ -2247,7 +2264,7 @@ static inline struct drm_i915_gem_request *
  i915_gem_request_reference(struct drm_i915_gem_request *req)
  {
  if (req)
-kref_get(req-ref);
+fence_get(req-fence);
  return req;
  }

@@ -2255,7 +2272,7 @@ static inline void
  i915_gem_request_unreference(struct drm_i915_gem_request *req)
  {
WARN_ON(!mutex_is_locked(req-ring-dev-struct_mutex));
-kref_put(req-ref, i915_gem_request_free);
+fence_put(req-fence);
  }

  static inline void
@@ -2267,7 +2284,7 @@ i915_gem_request_unreference__unlocked(struct
drm_i915_gem_request *req)
  return;

  dev = req-ring-dev;
-if (kref_put_mutex(req-ref, i915_gem_request_free,
dev-struct_mutex))
+if (kref_put_mutex(req-fence.refcount, fence_release,
dev-struct_mutex))
  mutex_unlock(dev-struct_mutex);
  }

@@ -2284,12 +2301,6 @@ static inline void
i915_gem_request_assign(struct drm_i915_gem_request **pdst,
  }

  /*
- * XXX: i915_gem_request_completed should be here but currently
needs the
- * definition of i915_seqno_passed() which is below. It will be
moved in
- * a later patch when the call to i915_seqno_passed() is obsoleted...
- */
-
-/*
   * A command that requires special handling by the command parser.
   */
  struct drm_i915_cmd_descriptor {
@@ -2851,18 +2862,6 @@ i915_seqno_passed(uint32_t seq1, uint32_t seq2)
  return (int32_t)(seq1 - seq2) = 0;
  }

-static inline bool i915_gem_request_completed(struct
drm_i915_gem_request *req,
-  bool lazy_coherency)
-{
-u32 seqno;
-
-BUG_ON(req == NULL);
-
-seqno = req-ring-get_seqno(req-ring, lazy_coherency);
-
-return i915_seqno_passed(seqno, req-seqno);
-}
-
  int __must_check i915_gem_get_seqno(struct drm_device 

Re: [Intel-gfx] [RFC 3/9] drm/i915: Convert requests to use struct fence

2015-07-28 Thread John Harrison

On 21/07/2015 08:05, Daniel Vetter wrote:

On Fri, Jul 17, 2015 at 03:31:17PM +0100, john.c.harri...@intel.com wrote:

From: John Harrison john.c.harri...@intel.com

There is a construct in the linux kernel called 'struct fence' that is intended
to keep track of work that is executed on hardware. I.e. it solves the basic
problem that the drivers 'struct drm_i915_gem_request' is trying to address. The
request structure does quite a lot more than simply track the execution progress
so is very definitely still required. However, the basic completion status side
could be updated to use the ready made fence implementation and gain all the
advantages that provides.

This patch makes the first step of integrating a struct fence into the request.
It replaces the explicit reference count with that of the fence. It also
replaces the 'is completed' test with the fence's equivalent. Currently, that
simply chains on to the original request implementation. A future patch will
improve this.

For: VIZ-5190
Signed-off-by: John Harrison john.c.harri...@intel.com
---
  drivers/gpu/drm/i915/i915_drv.h | 45 +
  drivers/gpu/drm/i915/i915_gem.c | 58 ++---
  drivers/gpu/drm/i915/intel_lrc.c|  1 +
  drivers/gpu/drm/i915/intel_ringbuffer.c |  1 +
  drivers/gpu/drm/i915/intel_ringbuffer.h |  3 ++
  5 files changed, 80 insertions(+), 28 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index cf6761c..79d346c 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -50,6 +50,7 @@
  #include linux/intel-iommu.h
  #include linux/kref.h
  #include linux/pm_qos.h
+#include linux/fence.h
  
  /* General customization:

   */
@@ -2150,7 +2151,17 @@ void i915_gem_track_fb(struct drm_i915_gem_object *old,
   * initial reference taken using kref_init
   */
  struct drm_i915_gem_request {
-   struct kref ref;
+   /**
+* Underlying object for implementing the signal/wait stuff.
+* NB: Never call fence_later() or return this fence object to user
+* land! Due to lazy allocation, scheduler re-ordering, pre-emption,
+* etc., there is no guarantee at all about the validity or
+* sequentiality of the fence's seqno! It is also unsafe to let
+* anything outside of the i915 driver get hold of the fence object
+* as the clean up when decrementing the reference count requires
+* holding the driver mutex lock.
+*/

This comment is outdated.
Not at this point in the patch series. Until the 'delay freeing of 
requests' patch it is very definitely unsafe to allow external reference 
counting of the fence object as decrementing the count must only be done 
with the mutex lock held. Likewise, without the per context per ring 
timeline, the seqno value would unsafe if any scheduling code was added. 
The comment is written from the point of view of what happens if this 
patch is taken as a stand alone patch without the rest of the series 
following and describes the state of the driver at that moment in time.



Also I'm leaning towards squashing this patch
with the one implementing fences with explicit irq enabling, to avoid
churn and intermediate WARN_ONs. Each patch should be fully functional
without requiring follow-up patches.


It is fully functional as a stand alone patch. The intermediate WARN_ONs 
would never fire unless someone takes only this patch and the starts 
adding extra (unsupported) usage of the fence object. If they want to do 
that then they must first add in the extra support that they need. Or 
they could just take the rest of the series which adds in that support 
for them.


Squashing the interrupt support into the basic fence implementation 
would just make for a much more complicated single patch.




-Daniel



+   struct fence fence;
  
  	/** On Which ring this request was generated */

struct drm_i915_private *i915;
@@ -2227,7 +2238,13 @@ int i915_gem_request_alloc(struct intel_engine_cs *ring,
   struct intel_context *ctx,
   struct drm_i915_gem_request **req_out);
  void i915_gem_request_cancel(struct drm_i915_gem_request *req);
-void i915_gem_request_free(struct kref *req_ref);
+
+static inline bool i915_gem_request_completed(struct drm_i915_gem_request *req,
+ bool lazy_coherency)
+{
+   return fence_is_signaled(req-fence);
+}
+
  int i915_gem_request_add_to_client(struct drm_i915_gem_request *req,
   struct drm_file *file);
  
@@ -2247,7 +2264,7 @@ static inline struct drm_i915_gem_request *

  i915_gem_request_reference(struct drm_i915_gem_request *req)
  {
if (req)
-   kref_get(req-ref);
+   fence_get(req-fence);
return req;
  }
  
@@ -2255,7 +2272,7 @@ static inline void

  i915_gem_request_unreference(struct 

Re: [Intel-gfx] [RFC 3/9] drm/i915: Convert requests to use struct fence

2015-07-28 Thread John Harrison

On 22/07/2015 15:26, Tvrtko Ursulin wrote:


Hi,

On 07/17/2015 03:31 PM, john.c.harri...@intel.com wrote:

From: John Harrison john.c.harri...@intel.com

There is a construct in the linux kernel called 'struct fence' that 
is intended
to keep track of work that is executed on hardware. I.e. it solves 
the basic
problem that the drivers 'struct drm_i915_gem_request' is trying to 
address. The
request structure does quite a lot more than simply track the 
execution progress
so is very definitely still required. However, the basic completion 
status side
could be updated to use the ready made fence implementation and gain 
all the

advantages that provides.

This patch makes the first step of integrating a struct fence into 
the request.

It replaces the explicit reference count with that of the fence. It also
replaces the 'is completed' test with the fence's equivalent. 
Currently, that
simply chains on to the original request implementation. A future 
patch will

improve this.

For: VIZ-5190
Signed-off-by: John Harrison john.c.harri...@intel.com
---
  drivers/gpu/drm/i915/i915_drv.h | 45 +
  drivers/gpu/drm/i915/i915_gem.c | 58 
++---

  drivers/gpu/drm/i915/intel_lrc.c|  1 +
  drivers/gpu/drm/i915/intel_ringbuffer.c |  1 +
  drivers/gpu/drm/i915/intel_ringbuffer.h |  3 ++
  5 files changed, 80 insertions(+), 28 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h 
b/drivers/gpu/drm/i915/i915_drv.h

index cf6761c..79d346c 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -50,6 +50,7 @@
  #include linux/intel-iommu.h
  #include linux/kref.h
  #include linux/pm_qos.h
+#include linux/fence.h

  /* General customization:
   */
@@ -2150,7 +2151,17 @@ void i915_gem_track_fb(struct 
drm_i915_gem_object *old,

   * initial reference taken using kref_init
   */
  struct drm_i915_gem_request {
-struct kref ref;
+/**
+ * Underlying object for implementing the signal/wait stuff.
+ * NB: Never call fence_later() or return this fence object to user
+ * land! Due to lazy allocation, scheduler re-ordering, 
pre-emption,

+ * etc., there is no guarantee at all about the validity or
+ * sequentiality of the fence's seqno! It is also unsafe to let
+ * anything outside of the i915 driver get hold of the fence object
+ * as the clean up when decrementing the reference count requires
+ * holding the driver mutex lock.
+ */
+struct fence fence;

  /** On Which ring this request was generated */
  struct drm_i915_private *i915;
@@ -2227,7 +2238,13 @@ int i915_gem_request_alloc(struct 
intel_engine_cs *ring,

 struct intel_context *ctx,
 struct drm_i915_gem_request **req_out);
  void i915_gem_request_cancel(struct drm_i915_gem_request *req);
-void i915_gem_request_free(struct kref *req_ref);
+
+static inline bool i915_gem_request_completed(struct 
drm_i915_gem_request *req,

+  bool lazy_coherency)
+{
+return fence_is_signaled(req-fence);
+}
+
  int i915_gem_request_add_to_client(struct drm_i915_gem_request *req,
 struct drm_file *file);

@@ -2247,7 +2264,7 @@ static inline struct drm_i915_gem_request *
  i915_gem_request_reference(struct drm_i915_gem_request *req)
  {
  if (req)
-kref_get(req-ref);
+fence_get(req-fence);
  return req;
  }

@@ -2255,7 +2272,7 @@ static inline void
  i915_gem_request_unreference(struct drm_i915_gem_request *req)
  {
WARN_ON(!mutex_is_locked(req-ring-dev-struct_mutex));
-kref_put(req-ref, i915_gem_request_free);
+fence_put(req-fence);
  }

  static inline void
@@ -2267,7 +2284,7 @@ i915_gem_request_unreference__unlocked(struct 
drm_i915_gem_request *req)

  return;

  dev = req-ring-dev;
-if (kref_put_mutex(req-ref, i915_gem_request_free, 
dev-struct_mutex))
+if (kref_put_mutex(req-fence.refcount, fence_release, 
dev-struct_mutex))

  mutex_unlock(dev-struct_mutex);


Would it be nicer to add fence_put_mutex(struct fence *, struct mutex 
*) for this? It would avoid the layering violation of requests peeking 
into fence implementation details.


Maarten pointed out that adding 'fence_put_mutex()' is breaking the 
fence ABI itself. It requires users of any random fence to know and 
worry about what mutex objects that fence's internal implementation 
might require.


This is a bit more hacky from our point of view but it is only a 
temporary measure until the mutex lock is not required for 
dereferencing. At that point all the nasty stuff disappears completely.






  }

@@ -2284,12 +2301,6 @@ static inline void 
i915_gem_request_assign(struct drm_i915_gem_request **pdst,

  }

  /*
- * XXX: i915_gem_request_completed should be here but currently 
needs the
- * definition of i915_seqno_passed() which is below. It will be 
moved in

- * a later patch when the call to i915_seqno_passed() is 

Re: [Intel-gfx] [RFC 3/9] drm/i915: Convert requests to use struct fence

2015-07-28 Thread John Harrison

On 22/07/2015 15:45, Tvrtko Ursulin wrote:


On 07/17/2015 03:31 PM, john.c.harri...@intel.com wrote:

From: John Harrison john.c.harri...@intel.com

There is a construct in the linux kernel called 'struct fence' that 
is intended
to keep track of work that is executed on hardware. I.e. it solves 
the basic
problem that the drivers 'struct drm_i915_gem_request' is trying to 
address. The
request structure does quite a lot more than simply track the 
execution progress
so is very definitely still required. However, the basic completion 
status side
could be updated to use the ready made fence implementation and gain 
all the

advantages that provides.

This patch makes the first step of integrating a struct fence into 
the request.

It replaces the explicit reference count with that of the fence. It also
replaces the 'is completed' test with the fence's equivalent. 
Currently, that
simply chains on to the original request implementation. A future 
patch will

improve this.

For: VIZ-5190
Signed-off-by: John Harrison john.c.harri...@intel.com
---
  drivers/gpu/drm/i915/i915_drv.h | 45 +
  drivers/gpu/drm/i915/i915_gem.c | 58 
++---

  drivers/gpu/drm/i915/intel_lrc.c|  1 +
  drivers/gpu/drm/i915/intel_ringbuffer.c |  1 +
  drivers/gpu/drm/i915/intel_ringbuffer.h |  3 ++
  5 files changed, 80 insertions(+), 28 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h 
b/drivers/gpu/drm/i915/i915_drv.h

index cf6761c..79d346c 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -50,6 +50,7 @@
  #include linux/intel-iommu.h
  #include linux/kref.h
  #include linux/pm_qos.h
+#include linux/fence.h

  /* General customization:
   */
@@ -2150,7 +2151,17 @@ void i915_gem_track_fb(struct 
drm_i915_gem_object *old,

   * initial reference taken using kref_init
   */
  struct drm_i915_gem_request {
-struct kref ref;
+/**
+ * Underlying object for implementing the signal/wait stuff.
+ * NB: Never call fence_later() or return this fence object to user
+ * land! Due to lazy allocation, scheduler re-ordering, 
pre-emption,

+ * etc., there is no guarantee at all about the validity or
+ * sequentiality of the fence's seqno! It is also unsafe to let
+ * anything outside of the i915 driver get hold of the fence object
+ * as the clean up when decrementing the reference count requires
+ * holding the driver mutex lock.
+ */
+struct fence fence;

  /** On Which ring this request was generated */
  struct drm_i915_private *i915;
@@ -2227,7 +2238,13 @@ int i915_gem_request_alloc(struct 
intel_engine_cs *ring,

 struct intel_context *ctx,
 struct drm_i915_gem_request **req_out);
  void i915_gem_request_cancel(struct drm_i915_gem_request *req);
-void i915_gem_request_free(struct kref *req_ref);
+
+static inline bool i915_gem_request_completed(struct 
drm_i915_gem_request *req,

+  bool lazy_coherency)
+{
+return fence_is_signaled(req-fence);
+}
+
  int i915_gem_request_add_to_client(struct drm_i915_gem_request *req,
 struct drm_file *file);

@@ -2247,7 +2264,7 @@ static inline struct drm_i915_gem_request *
  i915_gem_request_reference(struct drm_i915_gem_request *req)
  {
  if (req)
-kref_get(req-ref);
+fence_get(req-fence);
  return req;
  }

@@ -2255,7 +2272,7 @@ static inline void
  i915_gem_request_unreference(struct drm_i915_gem_request *req)
  {
WARN_ON(!mutex_is_locked(req-ring-dev-struct_mutex));
-kref_put(req-ref, i915_gem_request_free);
+fence_put(req-fence);
  }

  static inline void
@@ -2267,7 +2284,7 @@ i915_gem_request_unreference__unlocked(struct 
drm_i915_gem_request *req)

  return;

  dev = req-ring-dev;
-if (kref_put_mutex(req-ref, i915_gem_request_free, 
dev-struct_mutex))
+if (kref_put_mutex(req-fence.refcount, fence_release, 
dev-struct_mutex))

  mutex_unlock(dev-struct_mutex);
  }

@@ -2284,12 +2301,6 @@ static inline void 
i915_gem_request_assign(struct drm_i915_gem_request **pdst,

  }

  /*
- * XXX: i915_gem_request_completed should be here but currently 
needs the
- * definition of i915_seqno_passed() which is below. It will be 
moved in

- * a later patch when the call to i915_seqno_passed() is obsoleted...
- */
-
-/*
   * A command that requires special handling by the command parser.
   */
  struct drm_i915_cmd_descriptor {
@@ -2851,18 +2862,6 @@ i915_seqno_passed(uint32_t seq1, uint32_t seq2)
  return (int32_t)(seq1 - seq2) = 0;
  }

-static inline bool i915_gem_request_completed(struct 
drm_i915_gem_request *req,

-  bool lazy_coherency)
-{
-u32 seqno;
-
-BUG_ON(req == NULL);
-
-seqno = req-ring-get_seqno(req-ring, lazy_coherency);
-
-return i915_seqno_passed(seqno, req-seqno);
-}
-
  int __must_check i915_gem_get_seqno(struct drm_device *dev, u32 

Re: [Intel-gfx] [RFC 3/9] drm/i915: Convert requests to use struct fence

2015-07-22 Thread Tvrtko Ursulin


On 07/17/2015 03:31 PM, john.c.harri...@intel.com wrote:

From: John Harrison john.c.harri...@intel.com

There is a construct in the linux kernel called 'struct fence' that is intended
to keep track of work that is executed on hardware. I.e. it solves the basic
problem that the drivers 'struct drm_i915_gem_request' is trying to address. The
request structure does quite a lot more than simply track the execution progress
so is very definitely still required. However, the basic completion status side
could be updated to use the ready made fence implementation and gain all the
advantages that provides.

This patch makes the first step of integrating a struct fence into the request.
It replaces the explicit reference count with that of the fence. It also
replaces the 'is completed' test with the fence's equivalent. Currently, that
simply chains on to the original request implementation. A future patch will
improve this.

For: VIZ-5190
Signed-off-by: John Harrison john.c.harri...@intel.com
---
  drivers/gpu/drm/i915/i915_drv.h | 45 +
  drivers/gpu/drm/i915/i915_gem.c | 58 ++---
  drivers/gpu/drm/i915/intel_lrc.c|  1 +
  drivers/gpu/drm/i915/intel_ringbuffer.c |  1 +
  drivers/gpu/drm/i915/intel_ringbuffer.h |  3 ++
  5 files changed, 80 insertions(+), 28 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index cf6761c..79d346c 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -50,6 +50,7 @@
  #include linux/intel-iommu.h
  #include linux/kref.h
  #include linux/pm_qos.h
+#include linux/fence.h

  /* General customization:
   */
@@ -2150,7 +2151,17 @@ void i915_gem_track_fb(struct drm_i915_gem_object *old,
   * initial reference taken using kref_init
   */
  struct drm_i915_gem_request {
-   struct kref ref;
+   /**
+* Underlying object for implementing the signal/wait stuff.
+* NB: Never call fence_later() or return this fence object to user
+* land! Due to lazy allocation, scheduler re-ordering, pre-emption,
+* etc., there is no guarantee at all about the validity or
+* sequentiality of the fence's seqno! It is also unsafe to let
+* anything outside of the i915 driver get hold of the fence object
+* as the clean up when decrementing the reference count requires
+* holding the driver mutex lock.
+*/
+   struct fence fence;

/** On Which ring this request was generated */
struct drm_i915_private *i915;
@@ -2227,7 +2238,13 @@ int i915_gem_request_alloc(struct intel_engine_cs *ring,
   struct intel_context *ctx,
   struct drm_i915_gem_request **req_out);
  void i915_gem_request_cancel(struct drm_i915_gem_request *req);
-void i915_gem_request_free(struct kref *req_ref);
+
+static inline bool i915_gem_request_completed(struct drm_i915_gem_request *req,
+ bool lazy_coherency)
+{
+   return fence_is_signaled(req-fence);
+}
+
  int i915_gem_request_add_to_client(struct drm_i915_gem_request *req,
   struct drm_file *file);

@@ -2247,7 +2264,7 @@ static inline struct drm_i915_gem_request *
  i915_gem_request_reference(struct drm_i915_gem_request *req)
  {
if (req)
-   kref_get(req-ref);
+   fence_get(req-fence);
return req;
  }

@@ -2255,7 +2272,7 @@ static inline void
  i915_gem_request_unreference(struct drm_i915_gem_request *req)
  {
WARN_ON(!mutex_is_locked(req-ring-dev-struct_mutex));
-   kref_put(req-ref, i915_gem_request_free);
+   fence_put(req-fence);
  }

  static inline void
@@ -2267,7 +2284,7 @@ i915_gem_request_unreference__unlocked(struct 
drm_i915_gem_request *req)
return;

dev = req-ring-dev;
-   if (kref_put_mutex(req-ref, i915_gem_request_free, 
dev-struct_mutex))
+   if (kref_put_mutex(req-fence.refcount, fence_release, 
dev-struct_mutex))
mutex_unlock(dev-struct_mutex);
  }

@@ -2284,12 +2301,6 @@ static inline void i915_gem_request_assign(struct 
drm_i915_gem_request **pdst,
  }

  /*
- * XXX: i915_gem_request_completed should be here but currently needs the
- * definition of i915_seqno_passed() which is below. It will be moved in
- * a later patch when the call to i915_seqno_passed() is obsoleted...
- */
-
-/*
   * A command that requires special handling by the command parser.
   */
  struct drm_i915_cmd_descriptor {
@@ -2851,18 +2862,6 @@ i915_seqno_passed(uint32_t seq1, uint32_t seq2)
return (int32_t)(seq1 - seq2) = 0;
  }

-static inline bool i915_gem_request_completed(struct drm_i915_gem_request *req,
- bool lazy_coherency)
-{
-   u32 seqno;
-
-   BUG_ON(req == NULL);
-
-   seqno = req-ring-get_seqno(req-ring, lazy_coherency);
-
-   return 

Re: [Intel-gfx] [RFC 3/9] drm/i915: Convert requests to use struct fence

2015-07-22 Thread Tvrtko Ursulin


Hi,

On 07/17/2015 03:31 PM, john.c.harri...@intel.com wrote:

From: John Harrison john.c.harri...@intel.com

There is a construct in the linux kernel called 'struct fence' that is intended
to keep track of work that is executed on hardware. I.e. it solves the basic
problem that the drivers 'struct drm_i915_gem_request' is trying to address. The
request structure does quite a lot more than simply track the execution progress
so is very definitely still required. However, the basic completion status side
could be updated to use the ready made fence implementation and gain all the
advantages that provides.

This patch makes the first step of integrating a struct fence into the request.
It replaces the explicit reference count with that of the fence. It also
replaces the 'is completed' test with the fence's equivalent. Currently, that
simply chains on to the original request implementation. A future patch will
improve this.

For: VIZ-5190
Signed-off-by: John Harrison john.c.harri...@intel.com
---
  drivers/gpu/drm/i915/i915_drv.h | 45 +
  drivers/gpu/drm/i915/i915_gem.c | 58 ++---
  drivers/gpu/drm/i915/intel_lrc.c|  1 +
  drivers/gpu/drm/i915/intel_ringbuffer.c |  1 +
  drivers/gpu/drm/i915/intel_ringbuffer.h |  3 ++
  5 files changed, 80 insertions(+), 28 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index cf6761c..79d346c 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -50,6 +50,7 @@
  #include linux/intel-iommu.h
  #include linux/kref.h
  #include linux/pm_qos.h
+#include linux/fence.h

  /* General customization:
   */
@@ -2150,7 +2151,17 @@ void i915_gem_track_fb(struct drm_i915_gem_object *old,
   * initial reference taken using kref_init
   */
  struct drm_i915_gem_request {
-   struct kref ref;
+   /**
+* Underlying object for implementing the signal/wait stuff.
+* NB: Never call fence_later() or return this fence object to user
+* land! Due to lazy allocation, scheduler re-ordering, pre-emption,
+* etc., there is no guarantee at all about the validity or
+* sequentiality of the fence's seqno! It is also unsafe to let
+* anything outside of the i915 driver get hold of the fence object
+* as the clean up when decrementing the reference count requires
+* holding the driver mutex lock.
+*/
+   struct fence fence;

/** On Which ring this request was generated */
struct drm_i915_private *i915;
@@ -2227,7 +2238,13 @@ int i915_gem_request_alloc(struct intel_engine_cs *ring,
   struct intel_context *ctx,
   struct drm_i915_gem_request **req_out);
  void i915_gem_request_cancel(struct drm_i915_gem_request *req);
-void i915_gem_request_free(struct kref *req_ref);
+
+static inline bool i915_gem_request_completed(struct drm_i915_gem_request *req,
+ bool lazy_coherency)
+{
+   return fence_is_signaled(req-fence);
+}
+
  int i915_gem_request_add_to_client(struct drm_i915_gem_request *req,
   struct drm_file *file);

@@ -2247,7 +2264,7 @@ static inline struct drm_i915_gem_request *
  i915_gem_request_reference(struct drm_i915_gem_request *req)
  {
if (req)
-   kref_get(req-ref);
+   fence_get(req-fence);
return req;
  }

@@ -2255,7 +2272,7 @@ static inline void
  i915_gem_request_unreference(struct drm_i915_gem_request *req)
  {
WARN_ON(!mutex_is_locked(req-ring-dev-struct_mutex));
-   kref_put(req-ref, i915_gem_request_free);
+   fence_put(req-fence);
  }

  static inline void
@@ -2267,7 +2284,7 @@ i915_gem_request_unreference__unlocked(struct 
drm_i915_gem_request *req)
return;

dev = req-ring-dev;
-   if (kref_put_mutex(req-ref, i915_gem_request_free, 
dev-struct_mutex))
+   if (kref_put_mutex(req-fence.refcount, fence_release, 
dev-struct_mutex))
mutex_unlock(dev-struct_mutex);


Would it be nicer to add fence_put_mutex(struct fence *, struct mutex *) 
for this? It would avoid the layering violation of requests peeking into 
fence implementation details.



  }

@@ -2284,12 +2301,6 @@ static inline void i915_gem_request_assign(struct 
drm_i915_gem_request **pdst,
  }

  /*
- * XXX: i915_gem_request_completed should be here but currently needs the
- * definition of i915_seqno_passed() which is below. It will be moved in
- * a later patch when the call to i915_seqno_passed() is obsoleted...
- */
-
-/*
   * A command that requires special handling by the command parser.
   */
  struct drm_i915_cmd_descriptor {
@@ -2851,18 +2862,6 @@ i915_seqno_passed(uint32_t seq1, uint32_t seq2)
return (int32_t)(seq1 - seq2) = 0;
  }

-static inline bool i915_gem_request_completed(struct drm_i915_gem_request *req,
- 

Re: [Intel-gfx] [RFC 3/9] drm/i915: Convert requests to use struct fence

2015-07-21 Thread Daniel Vetter
On Fri, Jul 17, 2015 at 03:31:17PM +0100, john.c.harri...@intel.com wrote:
 From: John Harrison john.c.harri...@intel.com
 
 There is a construct in the linux kernel called 'struct fence' that is 
 intended
 to keep track of work that is executed on hardware. I.e. it solves the basic
 problem that the drivers 'struct drm_i915_gem_request' is trying to address. 
 The
 request structure does quite a lot more than simply track the execution 
 progress
 so is very definitely still required. However, the basic completion status 
 side
 could be updated to use the ready made fence implementation and gain all the
 advantages that provides.
 
 This patch makes the first step of integrating a struct fence into the 
 request.
 It replaces the explicit reference count with that of the fence. It also
 replaces the 'is completed' test with the fence's equivalent. Currently, that
 simply chains on to the original request implementation. A future patch will
 improve this.
 
 For: VIZ-5190
 Signed-off-by: John Harrison john.c.harri...@intel.com
 ---
  drivers/gpu/drm/i915/i915_drv.h | 45 +
  drivers/gpu/drm/i915/i915_gem.c | 58 
 ++---
  drivers/gpu/drm/i915/intel_lrc.c|  1 +
  drivers/gpu/drm/i915/intel_ringbuffer.c |  1 +
  drivers/gpu/drm/i915/intel_ringbuffer.h |  3 ++
  5 files changed, 80 insertions(+), 28 deletions(-)
 
 diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
 index cf6761c..79d346c 100644
 --- a/drivers/gpu/drm/i915/i915_drv.h
 +++ b/drivers/gpu/drm/i915/i915_drv.h
 @@ -50,6 +50,7 @@
  #include linux/intel-iommu.h
  #include linux/kref.h
  #include linux/pm_qos.h
 +#include linux/fence.h
  
  /* General customization:
   */
 @@ -2150,7 +2151,17 @@ void i915_gem_track_fb(struct drm_i915_gem_object *old,
   * initial reference taken using kref_init
   */
  struct drm_i915_gem_request {
 - struct kref ref;
 + /**
 +  * Underlying object for implementing the signal/wait stuff.
 +  * NB: Never call fence_later() or return this fence object to user
 +  * land! Due to lazy allocation, scheduler re-ordering, pre-emption,
 +  * etc., there is no guarantee at all about the validity or
 +  * sequentiality of the fence's seqno! It is also unsafe to let
 +  * anything outside of the i915 driver get hold of the fence object
 +  * as the clean up when decrementing the reference count requires
 +  * holding the driver mutex lock.
 +  */

This comment is outdated. Also I'm leaning towards squashing this patch
with the one implementing fences with explicit irq enabling, to avoid
churn and intermediate WARN_ONs. Each patch should be fully functional
without requiring follow-up patches.
-Daniel


 + struct fence fence;
  
   /** On Which ring this request was generated */
   struct drm_i915_private *i915;
 @@ -2227,7 +2238,13 @@ int i915_gem_request_alloc(struct intel_engine_cs 
 *ring,
  struct intel_context *ctx,
  struct drm_i915_gem_request **req_out);
  void i915_gem_request_cancel(struct drm_i915_gem_request *req);
 -void i915_gem_request_free(struct kref *req_ref);
 +
 +static inline bool i915_gem_request_completed(struct drm_i915_gem_request 
 *req,
 +   bool lazy_coherency)
 +{
 + return fence_is_signaled(req-fence);
 +}
 +
  int i915_gem_request_add_to_client(struct drm_i915_gem_request *req,
  struct drm_file *file);
  
 @@ -2247,7 +2264,7 @@ static inline struct drm_i915_gem_request *
  i915_gem_request_reference(struct drm_i915_gem_request *req)
  {
   if (req)
 - kref_get(req-ref);
 + fence_get(req-fence);
   return req;
  }
  
 @@ -2255,7 +2272,7 @@ static inline void
  i915_gem_request_unreference(struct drm_i915_gem_request *req)
  {
   WARN_ON(!mutex_is_locked(req-ring-dev-struct_mutex));
 - kref_put(req-ref, i915_gem_request_free);
 + fence_put(req-fence);
  }
  
  static inline void
 @@ -2267,7 +2284,7 @@ i915_gem_request_unreference__unlocked(struct 
 drm_i915_gem_request *req)
   return;
  
   dev = req-ring-dev;
 - if (kref_put_mutex(req-ref, i915_gem_request_free, 
 dev-struct_mutex))
 + if (kref_put_mutex(req-fence.refcount, fence_release, 
 dev-struct_mutex))
   mutex_unlock(dev-struct_mutex);
  }
  
 @@ -2284,12 +2301,6 @@ static inline void i915_gem_request_assign(struct 
 drm_i915_gem_request **pdst,
  }
  
  /*
 - * XXX: i915_gem_request_completed should be here but currently needs the
 - * definition of i915_seqno_passed() which is below. It will be moved in
 - * a later patch when the call to i915_seqno_passed() is obsoleted...
 - */
 -
 -/*
   * A command that requires special handling by the command parser.
   */
  struct drm_i915_cmd_descriptor {
 @@ -2851,18 +2862,6 @@ i915_seqno_passed(uint32_t seq1, uint32_t seq2)
   

[Intel-gfx] [RFC 3/9] drm/i915: Convert requests to use struct fence

2015-07-17 Thread John . C . Harrison
From: John Harrison john.c.harri...@intel.com

There is a construct in the linux kernel called 'struct fence' that is intended
to keep track of work that is executed on hardware. I.e. it solves the basic
problem that the drivers 'struct drm_i915_gem_request' is trying to address. The
request structure does quite a lot more than simply track the execution progress
so is very definitely still required. However, the basic completion status side
could be updated to use the ready made fence implementation and gain all the
advantages that provides.

This patch makes the first step of integrating a struct fence into the request.
It replaces the explicit reference count with that of the fence. It also
replaces the 'is completed' test with the fence's equivalent. Currently, that
simply chains on to the original request implementation. A future patch will
improve this.

For: VIZ-5190
Signed-off-by: John Harrison john.c.harri...@intel.com
---
 drivers/gpu/drm/i915/i915_drv.h | 45 +
 drivers/gpu/drm/i915/i915_gem.c | 58 ++---
 drivers/gpu/drm/i915/intel_lrc.c|  1 +
 drivers/gpu/drm/i915/intel_ringbuffer.c |  1 +
 drivers/gpu/drm/i915/intel_ringbuffer.h |  3 ++
 5 files changed, 80 insertions(+), 28 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index cf6761c..79d346c 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -50,6 +50,7 @@
 #include linux/intel-iommu.h
 #include linux/kref.h
 #include linux/pm_qos.h
+#include linux/fence.h
 
 /* General customization:
  */
@@ -2150,7 +2151,17 @@ void i915_gem_track_fb(struct drm_i915_gem_object *old,
  * initial reference taken using kref_init
  */
 struct drm_i915_gem_request {
-   struct kref ref;
+   /**
+* Underlying object for implementing the signal/wait stuff.
+* NB: Never call fence_later() or return this fence object to user
+* land! Due to lazy allocation, scheduler re-ordering, pre-emption,
+* etc., there is no guarantee at all about the validity or
+* sequentiality of the fence's seqno! It is also unsafe to let
+* anything outside of the i915 driver get hold of the fence object
+* as the clean up when decrementing the reference count requires
+* holding the driver mutex lock.
+*/
+   struct fence fence;
 
/** On Which ring this request was generated */
struct drm_i915_private *i915;
@@ -2227,7 +2238,13 @@ int i915_gem_request_alloc(struct intel_engine_cs *ring,
   struct intel_context *ctx,
   struct drm_i915_gem_request **req_out);
 void i915_gem_request_cancel(struct drm_i915_gem_request *req);
-void i915_gem_request_free(struct kref *req_ref);
+
+static inline bool i915_gem_request_completed(struct drm_i915_gem_request *req,
+ bool lazy_coherency)
+{
+   return fence_is_signaled(req-fence);
+}
+
 int i915_gem_request_add_to_client(struct drm_i915_gem_request *req,
   struct drm_file *file);
 
@@ -2247,7 +2264,7 @@ static inline struct drm_i915_gem_request *
 i915_gem_request_reference(struct drm_i915_gem_request *req)
 {
if (req)
-   kref_get(req-ref);
+   fence_get(req-fence);
return req;
 }
 
@@ -2255,7 +2272,7 @@ static inline void
 i915_gem_request_unreference(struct drm_i915_gem_request *req)
 {
WARN_ON(!mutex_is_locked(req-ring-dev-struct_mutex));
-   kref_put(req-ref, i915_gem_request_free);
+   fence_put(req-fence);
 }
 
 static inline void
@@ -2267,7 +2284,7 @@ i915_gem_request_unreference__unlocked(struct 
drm_i915_gem_request *req)
return;
 
dev = req-ring-dev;
-   if (kref_put_mutex(req-ref, i915_gem_request_free, 
dev-struct_mutex))
+   if (kref_put_mutex(req-fence.refcount, fence_release, 
dev-struct_mutex))
mutex_unlock(dev-struct_mutex);
 }
 
@@ -2284,12 +2301,6 @@ static inline void i915_gem_request_assign(struct 
drm_i915_gem_request **pdst,
 }
 
 /*
- * XXX: i915_gem_request_completed should be here but currently needs the
- * definition of i915_seqno_passed() which is below. It will be moved in
- * a later patch when the call to i915_seqno_passed() is obsoleted...
- */
-
-/*
  * A command that requires special handling by the command parser.
  */
 struct drm_i915_cmd_descriptor {
@@ -2851,18 +2862,6 @@ i915_seqno_passed(uint32_t seq1, uint32_t seq2)
return (int32_t)(seq1 - seq2) = 0;
 }
 
-static inline bool i915_gem_request_completed(struct drm_i915_gem_request *req,
- bool lazy_coherency)
-{
-   u32 seqno;
-
-   BUG_ON(req == NULL);
-
-   seqno = req-ring-get_seqno(req-ring, lazy_coherency);
-
-   return i915_seqno_passed(seqno, req-seqno);
-}
-
 int __must_check i915_gem_get_seqno(struct