Re: [Intel-gfx] [PATCH v4 4/6] drm/i915/guc: (re)initialise doorbell h/w when enabling GuC submission

2016-04-13 Thread Yu Dai



On 04/13/2016 12:46 PM, Dave Gordon wrote:

On 13/04/16 18:50, Yu Dai wrote:
>
>
> On 04/07/2016 10:21 AM, Dave Gordon wrote:
>> During a hibernate/resume cycle, the whole system is reset, including
>> the GuC and the doorbell hardware. Then the system is booted up, drivers
>> are loaded, etc -- the GuC firmware may be loaded and set running at this
>> point. But then, the booted kernel is replaced by the hibernated image,
>> and this resumed kernel will also try to reload the GuC firmware (which
>> will fail). To recover, we reset the GuC and try again (which should
>> work). But this GuC reset doesn't also reset the doorbell hardware, so
>> it can be left in a state inconsistent with that assumed by the driver
>> and the GuC.
>>
>> It would be better if the GuC reset also cleared all doorbell state,
>> but that's not how the hardware currently works; also, the driver cannot
>> directly reprogram the doorbell hardware (only the GuC can do that).
>>
>> So this patch cycles through all doorbells, assigning and releasing each
>> in turn, so that all the doorbell hardware is left in a consistent state,
>> no matter how it was programmed by the previously-running kernel and/or
>> GuC firmware.
>>
>> This patch can be removed if/when the GuC firmware is updated so that it
>> (re)initialises the doorbell hardware after every firmware (re)load.
>>
>> Signed-off-by: Dave Gordon <david.s.gor...@intel.com>
>> ---
>>   drivers/gpu/drm/i915/i915_guc_submission.c | 46 +-
>>   1 file changed, 45 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c
>> b/drivers/gpu/drm/i915/i915_guc_submission.c
>> index 2fc69f1..f466eab 100644
>> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
>> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
>> @@ -707,6 +707,50 @@ static void guc_client_free(struct drm_device *dev,
>>   kfree(client);
>>   }
>> +/*
>> + * Borrow the first client to set up & tear down every doorbell
>> + * in turn, to ensure that all doorbell h/w is (re)initialised.
>> + */
>> +static void guc_init_doorbell_hw(struct intel_guc *guc)
>> +{
>> +struct drm_i915_private *dev_priv = guc_to_i915(guc);
>> +struct i915_guc_client *client = guc->execbuf_client;
>> +struct guc_doorbell_info *doorbell;
>> +uint16_t db_id, i;
>> +void *base;
>> +int ret;
>> +
>> +base = kmap_atomic(i915_gem_object_get_page(client->client_obj, 0));
>> +doorbell = base + client->doorbell_offset;
>> +db_id = client->doorbell_id;
>> +
>> +for (i = 0; i < GUC_MAX_DOORBELLS; ++i) {
>> +i915_reg_t drbreg = GEN8_DRBREGL(i);
>> +u32 value = I915_READ(drbreg);
>> +
>> +ret = guc_update_doorbell_id(client, doorbell, i);
>> +
>> +if (((value & GUC_DOORBELL_ENABLED) && (i != db_id)) || ret)
>> +DRM_DEBUG_DRIVER("Doorbell reg 0x%x was 0x%x, ret %d\n",
>> +drbreg.reg, value, ret);
>> +}
>> +
>> +/* Restore to original value */
>> +guc_update_doorbell_id(client, doorbell, db_id);
>> +
>> +for (i = 0; i < GUC_MAX_DOORBELLS; ++i) {
>> +i915_reg_t drbreg = GEN8_DRBREGL(i);
>> +u32 value = I915_READ(drbreg);
>> +
>> +if ((value & GUC_DOORBELL_ENABLED) && (i != db_id))
>> +DRM_DEBUG_DRIVER("Doorbell reg 0x%x finally 0x%x\n",
>> +drbreg.reg, value);
>> +
>> +}
>> +
>
> The for loop above is not needed. It can be merged into the previous loop
> by printing out the new drbreg value (read it again after
> update_doorbell_id).
>
> At this point, we only need to check whether db_id is correctly enabled,
> by printing out I915_READ(GEN8_DRBREGL(db_id)).
>
> Alex

No, the idea is not to check that the GuC call has *enabled* each
selected doorbell, but to check that after the end of the first loop,
and the subsequent restore, all *other* doorbells have been *disabled*.
We're only *selecting* each doorbell so that we can then *deselect* it
as a side effect of selecting the next one!

Hence separate loop required ...

.Dave.


This can still be done by backing up the previous client->doorbell_id. If 
it is not the same as the desired db_id, then make sure it is *disabled* 
after the update.


The real problem here, though not an issue right now, is that this assumes 
there is only one guc_client. In future, if there are user-created 
guc_clients, the code won't restore their doorbells.


Alex


>> +kunmap_atomic(base);
>> +}
>> +
>>   /**
>>* guc_client_alloc() - Allocate an i915_guc_client
>>* @dev:drm device
>> @@ -971,8 +1015,8 @@ int i915_guc_submission_enable(struct drm_device
>> *dev)
>>   }
>>   guc->execbuf_client = client;
>> -
>>   host2guc_sample_forcewake(guc, client);
>> +guc_init_doorbell_hw(guc);
>>   return 0;
>>   }
>



___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH v4 2/6] drm/i915/guc: move guc_ring_doorbell() nearer to callsite

2016-04-13 Thread Yu Dai

LGTM.

Reviewed-by: Alex Dai 

On 04/07/2016 10:21 AM, Dave Gordon wrote:

Just code movement, no actual change to the function. This is in
preparation for the next patch, which will reorganise all the other
doorbell code, but doesn't change this function. So let's shuffle it
down near its caller rather than leaving it mixed in with the setup
code. Unlike the doorbell management code, this function is somewhat
time-critical, so putting it near its caller may even yield a tiny
performance improvement.

Signed-off-by: Dave Gordon 
---
  drivers/gpu/drm/i915/i915_guc_submission.c | 128 +++--
  1 file changed, 67 insertions(+), 61 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c 
b/drivers/gpu/drm/i915/i915_guc_submission.c
index da86bdb..2171759 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -190,67 +190,6 @@ static void guc_init_doorbell(struct intel_guc *guc,
kunmap_atomic(base);
  }
  
-static int guc_ring_doorbell(struct i915_guc_client *gc)

-{
-   struct guc_process_desc *desc;
-   union guc_doorbell_qw db_cmp, db_exc, db_ret;
-   union guc_doorbell_qw *db;
-   void *base;
-   int attempt = 2, ret = -EAGAIN;
-
-   base = kmap_atomic(i915_gem_object_get_page(gc->client_obj, 0));
-   desc = base + gc->proc_desc_offset;
-
-   /* Update the tail so it is visible to GuC */
-   desc->tail = gc->wq_tail;
-
-   /* current cookie */
-   db_cmp.db_status = GUC_DOORBELL_ENABLED;
-   db_cmp.cookie = gc->cookie;
-
-   /* cookie to be updated */
-   db_exc.db_status = GUC_DOORBELL_ENABLED;
-   db_exc.cookie = gc->cookie + 1;
-   if (db_exc.cookie == 0)
-   db_exc.cookie = 1;
-
-   /* pointer of current doorbell cacheline */
-   db = base + gc->doorbell_offset;
-
-   while (attempt--) {
-   /* lets ring the doorbell */
-   db_ret.value_qw = atomic64_cmpxchg((atomic64_t *)db,
-   db_cmp.value_qw, db_exc.value_qw);
-
-   /* if the exchange was successfully executed */
-   if (db_ret.value_qw == db_cmp.value_qw) {
-   /* db was successfully rung */
-   gc->cookie = db_exc.cookie;
-   ret = 0;
-   break;
-   }
-
-   /* XXX: doorbell was lost and need to acquire it again */
-   if (db_ret.db_status == GUC_DOORBELL_DISABLED)
-   break;
-
-   DRM_ERROR("Cookie mismatch. Expected %d, returned %d\n",
- db_cmp.cookie, db_ret.cookie);
-
-   /* update the cookie to newly read cookie from GuC */
-   db_cmp.cookie = db_ret.cookie;
-   db_exc.cookie = db_ret.cookie + 1;
-   if (db_exc.cookie == 0)
-   db_exc.cookie = 1;
-   }
-
-   /* Finally, update the cached copy of the GuC's WQ head */
-   gc->wq_head = desc->head;
-
-   kunmap_atomic(base);
-   return ret;
-}
-
  static void guc_disable_doorbell(struct intel_guc *guc,
 struct i915_guc_client *client)
  {
@@ -471,6 +410,12 @@ static void guc_fini_ctx_desc(struct intel_guc *guc,
 sizeof(desc) * client->ctx_index);
  }
  
+/*

+ * Everything above here is concerned with setup & teardown, and is
+ * therefore not part of the somewhat time-critical batch-submission
+ * path of i915_guc_submit() below.
+ */
+
  int i915_guc_wq_check_space(struct i915_guc_client *gc)
  {
struct guc_process_desc *desc;
@@ -559,6 +504,67 @@ static int guc_add_workqueue_item(struct i915_guc_client 
*gc,
return 0;
  }
  
+static int guc_ring_doorbell(struct i915_guc_client *gc)

+{
+   struct guc_process_desc *desc;
+   union guc_doorbell_qw db_cmp, db_exc, db_ret;
+   union guc_doorbell_qw *db;
+   void *base;
+   int attempt = 2, ret = -EAGAIN;
+
+   base = kmap_atomic(i915_gem_object_get_page(gc->client_obj, 0));
+   desc = base + gc->proc_desc_offset;
+
+   /* Update the tail so it is visible to GuC */
+   desc->tail = gc->wq_tail;
+
+   /* current cookie */
+   db_cmp.db_status = GUC_DOORBELL_ENABLED;
+   db_cmp.cookie = gc->cookie;
+
+   /* cookie to be updated */
+   db_exc.db_status = GUC_DOORBELL_ENABLED;
+   db_exc.cookie = gc->cookie + 1;
+   if (db_exc.cookie == 0)
+   db_exc.cookie = 1;
+
+   /* pointer of current doorbell cacheline */
+   db = base + gc->doorbell_offset;
+
+   while (attempt--) {
+   /* lets ring the doorbell */
+   db_ret.value_qw = atomic64_cmpxchg((atomic64_t *)db,
+   db_cmp.value_qw, db_exc.value_qw);
+
+   /* if the exchange was successfully executed */
+   if (db_ret.value_qw 

Re: [Intel-gfx] [PATCH v4 1/6] drm/i915/guc: add doorbell map to debugfs/i915_guc_info

2016-04-13 Thread Yu Dai

LGTM.

Reviewed-by: Alex Dai 

On 04/07/2016 10:21 AM, Dave Gordon wrote:

To properly verify the driver->doorbell->GuC functionality, validation
needs to know how the driver has assigned the doorbell cache lines and
registers, so make them visible through debugfs.

Signed-off-by: Dave Gordon 
---
  drivers/gpu/drm/i915/i915_debugfs.c | 8 
  1 file changed, 8 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c 
b/drivers/gpu/drm/i915/i915_debugfs.c
index be4bcdc..87a9f3e 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2488,6 +2488,7 @@ static int i915_guc_info(struct seq_file *m, void *data)
struct i915_guc_client client = {};
struct intel_engine_cs *engine;
u64 total = 0;
+   int i;
  
  	if (!HAS_GUC_SCHED(dev_priv))

return 0;
@@ -2502,6 +2503,13 @@ static int i915_guc_info(struct seq_file *m, void *data)
  
	mutex_unlock(&dev->struct_mutex);
  
+	seq_printf(m, "Doorbell map:\n");

+   for (i = 0; i < BITS_TO_LONGS(GUC_MAX_DOORBELLS) - 3; i += 4)
+   seq_printf(m, "\t%016lx %016lx %016lx %016lx\n",
+   guc.doorbell_bitmap[i], guc.doorbell_bitmap[i+1],
+   guc.doorbell_bitmap[i+2], guc.doorbell_bitmap[i+3]);
+   seq_printf(m, "Doorbell next cacheline: 0x%x\n\n", guc.db_cacheline);
+
seq_printf(m, "GuC total action count: %llu\n", guc.action_count);
seq_printf(m, "GuC action failure count: %u\n", guc.action_fail);
seq_printf(m, "GuC last action command: 0x%x\n", guc.action_cmd);




Re: [Intel-gfx] [PATCH v4 5/6] drm/i915/guc: disable GuC submission earlier during GuC (re)load

2016-04-13 Thread Yu Dai

LGTM.

Reviewed-by: Alex Dai 

On 04/07/2016 10:21 AM, Dave Gordon wrote:

When resetting and reloading the GuC, the GuC submission management code
also needs to destroy and recreate the GuC client(s). Currently this is
done by a separate call from the GuC loader, but really, it's just an
internal detail of the submission code. So here we remove the call from
the loader (which is too late, really, because the GuC has already been
reloaded at this point) and put it into guc_submission_init() instead.
This means that any preexisting client is destroyed *before* the GuC
(re)load and then recreated after, iff the firmware was successfully
loaded. If the GuC reload fails, we don't recreate the client, so fallback
to execlists mode (if active) won't leak the client object (previously,
the now-unusable client would have been left allocated, and leaked if
the driver were unloaded).

Signed-off-by: Dave Gordon 
---
  drivers/gpu/drm/i915/i915_guc_submission.c | 6 --
  drivers/gpu/drm/i915/intel_guc_loader.c| 3 ---
  2 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c 
b/drivers/gpu/drm/i915/i915_guc_submission.c
index f466eab..a8717f7 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -981,6 +981,10 @@ int i915_guc_submission_init(struct drm_device *dev)
const size_t gemsize = round_up(poolsize, PAGE_SIZE);
	struct intel_guc *guc = &dev_priv->guc;
  
+	/* Wipe bitmap & delete client in case of reinitialisation */

+   bitmap_clear(guc->doorbell_bitmap, 0, GUC_MAX_DOORBELLS);
+   i915_guc_submission_disable(dev);
+
if (!i915.enable_guc_submission)
return 0; /* not enabled  */
  
@@ -992,9 +996,7 @@ int i915_guc_submission_init(struct drm_device *dev)

return -ENOMEM;
  
  	ida_init(&guc->ctx_ids);

-
guc_create_log(guc);
-
guc_create_ads(guc);
  
  	return 0;

diff --git a/drivers/gpu/drm/i915/intel_guc_loader.c 
b/drivers/gpu/drm/i915/intel_guc_loader.c
index 876e5da..3e14a9a 100644
--- a/drivers/gpu/drm/i915/intel_guc_loader.c
+++ b/drivers/gpu/drm/i915/intel_guc_loader.c
@@ -470,9 +470,6 @@ int intel_guc_ucode_load(struct drm_device *dev)
intel_guc_fw_status_repr(guc_fw->guc_fw_load_status));
  
  	if (i915.enable_guc_submission) {

-   /* The execbuf_client will be recreated. Release it first. */
-   i915_guc_submission_disable(dev);
-
err = i915_guc_submission_enable(dev);
if (err)
goto fail;




Re: [Intel-gfx] [PATCH v4 4/6] drm/i915/guc: (re)initialise doorbell h/w when enabling GuC submission

2016-04-13 Thread Yu Dai



On 04/07/2016 10:21 AM, Dave Gordon wrote:

During a hibernate/resume cycle, the whole system is reset, including
the GuC and the doorbell hardware. Then the system is booted up, drivers
are loaded, etc -- the GuC firmware may be loaded and set running at this
point. But then, the booted kernel is replaced by the hibernated image,
and this resumed kernel will also try to reload the GuC firmware (which
will fail). To recover, we reset the GuC and try again (which should
work). But this GuC reset doesn't also reset the doorbell hardware, so
it can be left in a state inconsistent with that assumed by the driver
and the GuC.

It would be better if the GuC reset also cleared all doorbell state,
but that's not how the hardware currently works; also, the driver cannot
directly reprogram the doorbell hardware (only the GuC can do that).

So this patch cycles through all doorbells, assigning and releasing each
in turn, so that all the doorbell hardware is left in a consistent state,
no matter how it was programmed by the previously-running kernel and/or
GuC firmware.

This patch can be removed if/when the GuC firmware is updated so that it
(re)initialises the doorbell hardware after every firmware (re)load.

Signed-off-by: Dave Gordon 
---
  drivers/gpu/drm/i915/i915_guc_submission.c | 46 +-
  1 file changed, 45 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c 
b/drivers/gpu/drm/i915/i915_guc_submission.c
index 2fc69f1..f466eab 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -707,6 +707,50 @@ static void guc_client_free(struct drm_device *dev,
kfree(client);
  }
  
+/*

+ * Borrow the first client to set up & tear down every doorbell
+ * in turn, to ensure that all doorbell h/w is (re)initialised.
+ */
+static void guc_init_doorbell_hw(struct intel_guc *guc)
+{
+   struct drm_i915_private *dev_priv = guc_to_i915(guc);
+   struct i915_guc_client *client = guc->execbuf_client;
+   struct guc_doorbell_info *doorbell;
+   uint16_t db_id, i;
+   void *base;
+   int ret;
+
+   base = kmap_atomic(i915_gem_object_get_page(client->client_obj, 0));
+   doorbell = base + client->doorbell_offset;
+   db_id = client->doorbell_id;
+
+   for (i = 0; i < GUC_MAX_DOORBELLS; ++i) {
+   i915_reg_t drbreg = GEN8_DRBREGL(i);
+   u32 value = I915_READ(drbreg);
+
+   ret = guc_update_doorbell_id(client, doorbell, i);
+
+   if (((value & GUC_DOORBELL_ENABLED) && (i != db_id)) || ret)
+   DRM_DEBUG_DRIVER("Doorbell reg 0x%x was 0x%x, ret %d\n",
+   drbreg.reg, value, ret);
+   }
+
+   /* Restore to original value */
+   guc_update_doorbell_id(client, doorbell, db_id);
+
+   for (i = 0; i < GUC_MAX_DOORBELLS; ++i) {
+   i915_reg_t drbreg = GEN8_DRBREGL(i);
+   u32 value = I915_READ(drbreg);
+
+   if ((value & GUC_DOORBELL_ENABLED) && (i != db_id))
+   DRM_DEBUG_DRIVER("Doorbell reg 0x%x finally 0x%x\n",
+   drbreg.reg, value);
+
+   }
+


The for loop above is not needed. It can be merged into the previous loop 
by printing out the new drbreg value (read it again after 
update_doorbell_id).


At this point, we only need to check whether db_id is correctly enabled, 
by printing out I915_READ(GEN8_DRBREGL(db_id)).


Alex


+   kunmap_atomic(base);
+}
+
  /**
   * guc_client_alloc() - Allocate an i915_guc_client
   * @dev:  drm device
@@ -971,8 +1015,8 @@ int i915_guc_submission_enable(struct drm_device *dev)
}
  
  	guc->execbuf_client = client;

-
host2guc_sample_forcewake(guc, client);
+   guc_init_doorbell_hw(guc);
  
  	return 0;

  }




[Intel-gfx] [PATCH] drm/i915: Apply WaC6DisallowByGfxPause prior to SKL B0 / BXT A1

2016-04-07 Thread yu . dai
From: Alex Dai 

No need for this workaround since SKL C0 and BXT B0.

Issue: VIZ-7615
Signed-off-by: Alex Dai 
---
 drivers/gpu/drm/i915/intel_guc_loader.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_guc_loader.c 
b/drivers/gpu/drm/i915/intel_guc_loader.c
index c0e5a01..ac85c28 100644
--- a/drivers/gpu/drm/i915/intel_guc_loader.c
+++ b/drivers/gpu/drm/i915/intel_guc_loader.c
@@ -327,15 +327,15 @@ static int guc_ucode_xfer(struct drm_i915_private 
*dev_priv)
/* Enable MIA caching. GuC clock gating is disabled. */
I915_WRITE(GUC_SHIM_CONTROL, GUC_SHIM_CONTROL_VALUE);
 
-   /* WaDisableMinuteIaClockGating:skl,bxt */
if (IS_SKL_REVID(dev, 0, SKL_REVID_B0) ||
IS_BXT_REVID(dev, 0, BXT_REVID_A1)) {
+   /* WaDisableMinuteIaClockGating:skl,bxt */
I915_WRITE(GUC_SHIM_CONTROL, (I915_READ(GUC_SHIM_CONTROL) &
  ~GUC_ENABLE_MIA_CLOCK_GATING));
-   }
 
-   /* WaC6DisallowByGfxPause*/
-   I915_WRITE(GEN6_GFXPAUSE, 0x30FFF);
+   /* WaC6DisallowByGfxPause*/
+   I915_WRITE(GEN6_GFXPAUSE, 0x30FFF);
+   }
 
if (IS_BROXTON(dev))
I915_WRITE(GEN9LP_GT_PM_CONFIG, GT_DOORBELL_ENABLE);
-- 
2.5.0



Re: [Intel-gfx] [PATCH v4 3/6] drm/i915/guc: refactor doorbell management code

2016-04-07 Thread Yu Dai



On 04/07/2016 10:21 AM, Dave Gordon wrote:

During a hibernate/resume cycle, the driver, the GuC, and the doorbell
hardware can end up in inconsistent states. This patch refactors the
driver's handling and tracking of doorbells, in preparation for a later
one which will resolve the issue.

Signed-off-by: Dave Gordon 
---
  drivers/gpu/drm/i915/i915_guc_submission.c | 88 ++
  1 file changed, 53 insertions(+), 35 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c 
b/drivers/gpu/drm/i915/i915_guc_submission.c
index 2171759..2fc69f1 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -175,8 +175,48 @@ static int host2guc_sample_forcewake(struct intel_guc *guc,
   * client object which contains the page being used for the doorbell
   */
  
+static int guc_update_doorbell_id(struct i915_guc_client *client,

+ struct guc_doorbell_info *doorbell,
+ u16 new_id)
+{
+   struct sg_table *sg = client->guc->ctx_pool_obj->pages;
+   void *doorbell_bitmap = client->guc->doorbell_bitmap;
+   struct guc_context_desc desc;
+   size_t len;
+
+   if (client->doorbell_id != GUC_INVALID_DOORBELL_ID &&
+   test_bit(client->doorbell_id, doorbell_bitmap)) {
+   /* Deactivate the old doorbell */
+   doorbell->db_status = GUC_DOORBELL_DISABLED;
+   (void)host2guc_release_doorbell(client->guc, client);
+   clear_bit(client->doorbell_id, doorbell_bitmap);
+   }
+
+   /* Update the GuC's idea of the doorbell ID */
+   len = sg_pcopy_to_buffer(sg->sgl, sg->nents, &desc, sizeof(desc),
+sizeof(desc) * client->ctx_index);
+   if (len != sizeof(desc))
+   return -EFAULT;
+   desc.db_id = new_id;
+   len = sg_pcopy_from_buffer(sg->sgl, sg->nents, &desc, sizeof(desc),
+sizeof(desc) * client->ctx_index);
+   if (len != sizeof(desc))
+   return -EFAULT;
+


We could cache the vmap of the context pool for its whole life cycle to 
avoid these copies. That is why a generic vmap helper function would be 
really nice to have.


Alex

+   client->doorbell_id = new_id;
+   if (new_id == GUC_INVALID_DOORBELL_ID)
+   return 0;
+
+   /* Activate the new doorbell */
+   set_bit(client->doorbell_id, doorbell_bitmap);
+   doorbell->db_status = GUC_DOORBELL_ENABLED;
+   doorbell->cookie = 0;
+   return host2guc_allocate_doorbell(client->guc, client);
+}
+
  static void guc_init_doorbell(struct intel_guc *guc,
- struct i915_guc_client *client)
+ struct i915_guc_client *client,
+ uint16_t db_id)
  {
struct guc_doorbell_info *doorbell;
void *base;
@@ -184,8 +224,7 @@ static void guc_init_doorbell(struct intel_guc *guc,
base = kmap_atomic(i915_gem_object_get_page(client->client_obj, 0));
doorbell = base + client->doorbell_offset;
  
-	doorbell->db_status = 1;

-   doorbell->cookie = 0;
+   guc_update_doorbell_id(client, doorbell, db_id);
  
  	kunmap_atomic(base);

  }
@@ -193,27 +232,16 @@ static void guc_init_doorbell(struct intel_guc *guc,
  static void guc_disable_doorbell(struct intel_guc *guc,
 struct i915_guc_client *client)
  {
-   struct drm_i915_private *dev_priv = guc_to_i915(guc);
struct guc_doorbell_info *doorbell;
void *base;
-   i915_reg_t drbreg = GEN8_DRBREGL(client->doorbell_id);
-   int value;
  
  	base = kmap_atomic(i915_gem_object_get_page(client->client_obj, 0));

doorbell = base + client->doorbell_offset;
  
-	doorbell->db_status = 0;

+   guc_update_doorbell_id(client, doorbell, GUC_INVALID_DOORBELL_ID);
  
  	kunmap_atomic(base);
  
-	I915_WRITE(drbreg, I915_READ(drbreg) & ~GEN8_DRB_VALID);

-
-   value = I915_READ(drbreg);
-   WARN_ON((value & GEN8_DRB_VALID) != 0);
-
-   I915_WRITE(GEN8_DRBREGU(client->doorbell_id), 0);
-   I915_WRITE(drbreg, 0);
-
/* XXX: wait for any interrupts */
/* XXX: wait for workqueue to drain */
  }
@@ -260,7 +288,7 @@ static uint16_t assign_doorbell(struct intel_guc *guc, 
uint32_t priority)
if (id == end)
id = GUC_INVALID_DOORBELL_ID;
else
-   bitmap_set(guc->doorbell_bitmap, id, 1);
+   set_bit(id, guc->doorbell_bitmap);
  
  	DRM_DEBUG_DRIVER("assigned %s priority doorbell id 0x%x\n",

hi_pri ? "high" : "normal", id);
@@ -268,11 +296,6 @@ static uint16_t assign_doorbell(struct intel_guc *guc, 
uint32_t priority)
return id;
  }
  
-static void release_doorbell(struct intel_guc *guc, uint16_t id)

-{
-   bitmap_clear(guc->doorbell_bitmap, id, 1);
-}
-
  /*
   * Initialise the process descriptor shared with the 

Re: [Intel-gfx] [PATCH v2 2/2] drm/i915/guc: Reset GuC and retry on firmware load failure

2016-03-10 Thread Yu Dai

LGTM. Reviewed-by: Alex Dai 

On 03/08/2016 03:38 AM, Arun Siluvery wrote:

Due to timing issues in the HW, some of the status bits required for GuC
authentication occasionally don't get set; when that happens, the GuC cannot
be initialized and we will be left with a wedged GPU. The suggested WA is
to perform a soft reset of the GuC and attempt to reload the fw a few times
before giving up.

As the failure is timing-dependent, tests performed by triggering a manual
full gpu reset (i915_wedged) showed that we could sometimes hit this after
several thousand iterations, but sometimes tests ran even longer without any
issues. The reset-and-reload mechanism proved helpful when we did hit a fw
load failure, so it is worth including to improve driver stability.

This change implements the following WAs:

WaEnableuKernelHeaderValidFix:skl,bxt
WaEnableGuCBootHashCheckNotSet:skl,bxt

Cc: Dave Gordon 
Cc: Alex Dai 
Signed-off-by: Arun Siluvery 
---
  drivers/gpu/drm/i915/i915_drv.h |  1 +
  drivers/gpu/drm/i915/i915_guc_reg.h |  1 +
  drivers/gpu/drm/i915/i915_reg.h |  1 +
  drivers/gpu/drm/i915/intel_guc_loader.c | 49 +++--
  drivers/gpu/drm/i915/intel_uncore.c | 19 +
  5 files changed, 69 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index f37ac12..0df7c82 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2757,6 +2757,7 @@ extern long i915_compat_ioctl(struct file *filp, unsigned 
int cmd,
  extern int intel_gpu_reset(struct drm_device *dev);
  extern bool intel_has_gpu_reset(struct drm_device *dev);
  extern int i915_reset(struct drm_device *dev);
+extern int intel_guc_reset(struct drm_i915_private *dev_priv);
  extern unsigned long i915_chipset_val(struct drm_i915_private *dev_priv);
  extern unsigned long i915_mch_val(struct drm_i915_private *dev_priv);
  extern unsigned long i915_gfx_val(struct drm_i915_private *dev_priv);
diff --git a/drivers/gpu/drm/i915/i915_guc_reg.h 
b/drivers/gpu/drm/i915/i915_guc_reg.h
index e4ba582..94ceee5 100644
--- a/drivers/gpu/drm/i915/i915_guc_reg.h
+++ b/drivers/gpu/drm/i915/i915_guc_reg.h
@@ -27,6 +27,7 @@
  /* Definitions of GuC H/W registers, bits, etc */
  
  #define GUC_STATUS			_MMIO(0xc000)

+#define   GS_MIA_IN_RESET  (1 << 0)
  #define   GS_BOOTROM_SHIFT1
  #define   GS_BOOTROM_MASK   (0x7F << GS_BOOTROM_SHIFT)
  #define   GS_BOOTROM_RSA_FAILED (0x50 << GS_BOOTROM_SHIFT)
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 7dfc400..48a23de 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -164,6 +164,7 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
  #define  GEN6_GRDOM_RENDER(1 << 1)
  #define  GEN6_GRDOM_MEDIA (1 << 2)
  #define  GEN6_GRDOM_BLT   (1 << 3)
+#define  GEN9_GRDOM_GUC(1 << 5)
  
  #define RING_PP_DIR_BASE(ring)		_MMIO((ring)->mmio_base+0x228)

  #define RING_PP_DIR_BASE_READ(ring)   _MMIO((ring)->mmio_base+0x518)
diff --git a/drivers/gpu/drm/i915/intel_guc_loader.c 
b/drivers/gpu/drm/i915/intel_guc_loader.c
index 82a3c03..f9cb814 100644
--- a/drivers/gpu/drm/i915/intel_guc_loader.c
+++ b/drivers/gpu/drm/i915/intel_guc_loader.c
@@ -353,6 +353,24 @@ static int guc_ucode_xfer(struct drm_i915_private 
*dev_priv)
return ret;
  }
  
+static int i915_reset_guc(struct drm_i915_private *dev_priv)

+{
+   int ret;
+   u32 guc_status;
+
+   ret = intel_guc_reset(dev_priv);
+   if (ret) {
+   DRM_ERROR("GuC reset failed, ret = %d\n", ret);
+   return ret;
+   }
+
+   guc_status = I915_READ(GUC_STATUS);
+   WARN(!(guc_status & GS_MIA_IN_RESET),
+"GuC status: 0x%x, MIA core expected to be in reset\n", 
guc_status);
+
+   return ret;
+}
+
  /**
   * intel_guc_ucode_load() - load GuC uCode into the device
   * @dev:  drm device
@@ -417,9 +435,36 @@ int intel_guc_ucode_load(struct drm_device *dev)
if (err)
goto fail;
  
+	/*

+* WaEnableuKernelHeaderValidFix:skl,bxt
+* For BXT, this applies only up to B0, but the WA below is required
+* for later steppings as well, so it is extended here too.
+*/
+   /* WaEnableGuCBootHashCheckNotSet:skl,bxt */
err = guc_ucode_xfer(dev_priv);
-   if (err)
-   goto fail;
+   if (err) {
+   int retries = 3;
+
+   DRM_ERROR("GuC fw load failed, err=%d, attempting reset and 
retry\n", err);
+
+   while (retries--) {
+   err = i915_reset_guc(dev_priv);
+   if (err)
+   break;
+
+   err = guc_ucode_xfer(dev_priv);
+ 

[Intel-gfx] [PATCH] drm/i915/guc: Support GuC SKL v6.1

2016-02-24 Thread yu . dai
From: Alex Dai 

This version of the GuC firmware fixes the engine reset issue where the
golden context LRC address was treated as a page index by mistake. It also
fixes the problem where the scheduler stops submitting to one engine when
the other engine's work queue is full.

Signed-off-by: Alex Dai 
---
 drivers/gpu/drm/i915/intel_guc_loader.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_guc_loader.c 
b/drivers/gpu/drm/i915/intel_guc_loader.c
index e0093a9..e329a8a 100644
--- a/drivers/gpu/drm/i915/intel_guc_loader.c
+++ b/drivers/gpu/drm/i915/intel_guc_loader.c
@@ -59,7 +59,7 @@
  *
  */
 
-#define I915_SKL_GUC_UCODE "i915/skl_guc_ver4.bin"
+#define I915_SKL_GUC_UCODE "i915/skl_guc_ver6.bin"
 MODULE_FIRMWARE(I915_SKL_GUC_UCODE);
 
 /* User-friendly representation of an enum */
@@ -611,8 +611,8 @@ void intel_guc_ucode_init(struct drm_device *dev)
fw_path = NULL;
} else if (IS_SKYLAKE(dev)) {
fw_path = I915_SKL_GUC_UCODE;
-   guc_fw->guc_fw_major_wanted = 4;
-   guc_fw->guc_fw_minor_wanted = 3;
+   guc_fw->guc_fw_major_wanted = 6;
+   guc_fw->guc_fw_minor_wanted = 1;
} else {
fw_path = "";   /* unknown device */
}
-- 
2.5.0



Re: [Intel-gfx] [PATCH] drm/i915: add enable_guc_loading parameter

2016-02-19 Thread Yu Dai

LGTM.

Reviewed-by: Alex Dai 

Thanks,
Alex

On 02/03/2016 04:56 AM, Dave Gordon wrote:

Split the function of "enable_guc_submission" into two separate options.
The new one "enable_guc_loading" controls only the *fetching and loading*
of the GuC firmware image. The existing one is redefined to control only
the *use* of the GuC for batch submission once the firmware is loaded.

In addition, the degree of control has been refined from a simple bool
to an integer key, allowing several options:
  -1  (default)whatever the platform default is
   0  DISABLE  don't load/use the GuC
   1  BEST EFFORT  try to load/use the GuC, fallback if not available
   2  REQUIRE  must load/use the GuC, else leave the GPU wedged

The new platform default (as coded here) will be to attempt to load
the GuC iff the device has a GuC that requires firmware, to attempt to
use it iff the device has a GuC that supports the submission protocol
(with or without firmware), and to fall back to execlist mode if any
required firmware cannot be found or fails to load.

Signed-off-by: Dave Gordon 
Cc: Jani Nikula 
---
  drivers/gpu/drm/i915/i915_gem.c |  1 -
  drivers/gpu/drm/i915/i915_params.c  | 14 -
  drivers/gpu/drm/i915/i915_params.h  |  3 +-
  drivers/gpu/drm/i915/intel_guc_loader.c | 99 ++---
  4 files changed, 68 insertions(+), 49 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index a928823..cd7aada 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4892,7 +4892,6 @@ int i915_gem_init_rings(struct drm_device *dev)
ret = intel_guc_ucode_load(dev);
if (ret) {
DRM_ERROR("Failed to initialize GuC, error %d\n", ret);
-   ret = -EIO;
goto out;
}
}
diff --git a/drivers/gpu/drm/i915/i915_params.c 
b/drivers/gpu/drm/i915/i915_params.c
index 8b9f368..ad785ad 100644
--- a/drivers/gpu/drm/i915/i915_params.c
+++ b/drivers/gpu/drm/i915/i915_params.c
@@ -55,7 +55,8 @@ struct i915_params i915 __read_mostly = {
.verbose_state_checks = 1,
.nuclear_pageflip = 0,
.edp_vswing = 0,
-   .enable_guc_submission = false,
+   .enable_guc_loading = -1,
+   .enable_guc_submission = -1,
.guc_log_level = -1,
  };
  
@@ -198,8 +199,15 @@ struct i915_params i915 __read_mostly = {

 "(0=use value from vbt [default], 1=low power swing(200mV),"
 "2=default swing(400mV))");
  
-module_param_named_unsafe(enable_guc_submission, i915.enable_guc_submission, bool, 0400);

-MODULE_PARM_DESC(enable_guc_submission, "Enable GuC submission 
(default:false)");
+module_param_named_unsafe(enable_guc_loading, i915.enable_guc_loading, int, 
0400);
+MODULE_PARM_DESC(enable_guc_loading,
+   "Enable GuC firmware loading "
+   "(-1=auto [default], 0=never, 1=if available, 2=required)");
+
+module_param_named_unsafe(enable_guc_submission, i915.enable_guc_submission, 
int, 0400);
+MODULE_PARM_DESC(enable_guc_submission,
+   "Enable GuC submission "
+   "(-1=auto [default], 0=never, 1=if available, 2=required)");
  
  module_param_named(guc_log_level, i915.guc_log_level, int, 0400);

  MODULE_PARM_DESC(guc_log_level,
diff --git a/drivers/gpu/drm/i915/i915_params.h b/drivers/gpu/drm/i915/i915_params.h
index 5299290..7180261 100644
--- a/drivers/gpu/drm/i915/i915_params.h
+++ b/drivers/gpu/drm/i915/i915_params.h
@@ -45,6 +45,8 @@ struct i915_params {
int enable_ips;
int invert_brightness;
int enable_cmd_parser;
+   int enable_guc_loading;
+   int enable_guc_submission;
int guc_log_level;
int use_mmio_flip;
int mmio_debug;
@@ -57,7 +59,6 @@ struct i915_params {
bool reset;
bool disable_display;
bool disable_vtd_wa;
-   bool enable_guc_submission;
bool verbose_state_checks;
bool nuclear_pageflip;
  };
diff --git a/drivers/gpu/drm/i915/intel_guc_loader.c b/drivers/gpu/drm/i915/intel_guc_loader.c
index 3accd91..e7eb3db 100644
--- a/drivers/gpu/drm/i915/intel_guc_loader.c
+++ b/drivers/gpu/drm/i915/intel_guc_loader.c
@@ -369,49 +369,37 @@ int intel_guc_ucode_load(struct drm_device *dev)
  {
struct drm_i915_private *dev_priv = dev->dev_private;
	struct intel_guc_fw *guc_fw = &dev_priv->guc.guc_fw;
+   const char *fw_path = guc_fw->guc_fw_path;
int err = 0;
  
-	if (!i915.enable_guc_submission)
-		return 0;
-
-   DRM_DEBUG_DRIVER("GuC fw status: fetch %s, load %s\n",
+   DRM_DEBUG_DRIVER("GuC fw status: path %s, fetch %s, load %s\n",
+   fw_path,
intel_guc_fw_status_repr(guc_fw->guc_fw_fetch_status),
intel_guc_fw_status_repr(guc_fw->guc_fw_load_status));
  

Re: [Intel-gfx] [PATCH v2 1/2] drm/i915: Add i915_gem_object_vmap to map GEM object to virtual space

2016-02-18 Thread Yu Dai



On 02/18/2016 01:05 PM, Chris Wilson wrote:

On Thu, Feb 18, 2016 at 10:31:37AM -0800, yu@intel.com wrote:
> From: Alex Dai 
>
> There are several places inside the driver where a GEM object is mapped
> to kernel virtual space. The mapping is done either for the whole object
> or for a certain page range of it.
>
> This patch introduces a function, i915_gem_object_vmap, to do this job.
>
> v2: Use obj->pages->nents for iteration within i915_gem_object_vmap;
> break when it finishes all desired pages. The caller needs to pass
> in the actual page count. (Tvrtko Ursulin)

Who owns the pages? vmap doesn't increase the page refcount nor
mapcount, so it is the caller's responsibility to keep the pages alive
for the duration of the vmapping.

I suggested i915_gem_object_pin_vmap/unpin_vmap for that reason and that
also provides the foundation for undoing one of the more substantial
performance regressions from vmap_batch().
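To make the ownership point concrete, here is a toy user-space model of the suggested pin_vmap/unpin_vmap pairing: the map helper takes its own pin on the backing pages so they stay alive for the lifetime of the mapping, and the unmap helper drops that pin. All names and fields here are hypothetical:

```c
#include <assert.h>

/* Toy stand-in for a GEM object with pinnable backing pages. */
struct toy_obj {
	int pin_count;	/* how many users are keeping the pages alive */
	int vmapped;	/* whether a virtual mapping currently exists */
};

static void toy_pin_vmap(struct toy_obj *o)
{
	o->pin_count++;		/* pages now guaranteed to stay resident */
	o->vmapped = 1;		/* stand-in for the real vmap() call */
}

static void toy_unpin_vmap(struct toy_obj *o)
{
	o->vmapped = 0;		/* stand-in for vunmap() */
	o->pin_count--;		/* release the reference taken at map time */
}
```
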




OK, found it at 050/190 of your patch series. That is a huge list of 
patches. :-) The code I put here does not change (or at least tries to 
keep) the current code logic and driver behavior. I am not opposed to 
using i915_gem_object_pin_vmap/unpin_vmap at all. I will just keep an 
eye on that patch.


Alex
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH v2 1/2] drm/i915: Add i915_gem_object_vmap to map GEM object to virtual space

2016-02-18 Thread yu . dai
From: Alex Dai 

There are several places inside the driver where a GEM object is mapped
to kernel virtual space. The mapping is done either for the whole object
or for a certain page range of it.

This patch introduces a function, i915_gem_object_vmap, to do this job.

v2: Use obj->pages->nents for iteration within i915_gem_object_vmap;
break when it finishes all desired pages. The caller needs to pass
in the actual page count. (Tvrtko Ursulin)
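The first/npages arguments this helper takes are typically derived from a byte range; the rounding arithmetic used by callers such as vmap_batch() can be sketched as below, assuming 4 KiB pages (PAGE_SHIFT == 12). This is an illustration, not the driver's code:

```c
#include <assert.h>

#define SKETCH_PAGE_SHIFT 12u
#define SKETCH_PAGE_SIZE  (1u << SKETCH_PAGE_SHIFT)

static unsigned int range_first_page(unsigned int start)
{
	return start >> SKETCH_PAGE_SHIFT;
}

static unsigned int range_npages(unsigned int start, unsigned int len)
{
	unsigned int first = start >> SKETCH_PAGE_SHIFT;
	/* round the end of the byte range up to the next page boundary */
	unsigned int last = (start + len + SKETCH_PAGE_SIZE - 1)
				>> SKETCH_PAGE_SHIFT;
	return last - first;
}
```

A range that straddles a page boundary (e.g. 4096 bytes starting at offset 100) needs two pages even though it is only one page long.
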

Signed-off-by: Alex Dai 
Cc: Dave Gordon 
Cc: Daniel Vetter 
Cc: Tvrtko Ursulin 
Cc: Chris Wilson 
Signed-off-by: Alex Dai 
---
 drivers/gpu/drm/i915/i915_cmd_parser.c  | 28 +---
 drivers/gpu/drm/i915/i915_drv.h |  3 +++
 drivers/gpu/drm/i915/i915_gem.c | 47 +
 drivers/gpu/drm/i915/i915_gem_dmabuf.c  | 16 +++
 drivers/gpu/drm/i915/intel_ringbuffer.c | 24 ++---
 5 files changed, 56 insertions(+), 62 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c
index 814d894..915e8c1 100644
--- a/drivers/gpu/drm/i915/i915_cmd_parser.c
+++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
@@ -863,37 +863,11 @@ find_reg(const struct drm_i915_reg_descriptor *table,
 static u32 *vmap_batch(struct drm_i915_gem_object *obj,
   unsigned start, unsigned len)
 {
-   int i;
-   void *addr = NULL;
-   struct sg_page_iter sg_iter;
int first_page = start >> PAGE_SHIFT;
int last_page = (len + start + 4095) >> PAGE_SHIFT;
int npages = last_page - first_page;
-   struct page **pages;
-
-   pages = drm_malloc_ab(npages, sizeof(*pages));
-   if (pages == NULL) {
-   DRM_DEBUG_DRIVER("Failed to get space for pages\n");
-   goto finish;
-   }
-
-   i = 0;
-   for_each_sg_page(obj->pages->sgl, &sg_iter, obj->pages->nents, first_page) {
-   pages[i++] = sg_page_iter_page(&sg_iter);
-   if (i == npages)
-   break;
-   }
-
-   addr = vmap(pages, i, 0, PAGE_KERNEL);
-   if (addr == NULL) {
-   DRM_DEBUG_DRIVER("Failed to vmap pages\n");
-   goto finish;
-   }
 
-finish:
-   if (pages)
-   drm_free_large(pages);
-   return (u32*)addr;
+   return (u32*)i915_gem_object_vmap(obj, first_page, npages);
 }
 
 /* Returns a vmap'd pointer to dest_obj, which the caller must unmap */
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 6644c2e..5b00a6a 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2899,6 +2899,9 @@ struct drm_i915_gem_object *i915_gem_object_create_from_data(
struct drm_device *dev, const void *data, size_t size);
 void i915_gem_free_object(struct drm_gem_object *obj);
 void i915_gem_vma_destroy(struct i915_vma *vma);
+void *i915_gem_object_vmap(struct drm_i915_gem_object *obj,
+  unsigned int first,
+  unsigned int npages);
 
 /* Flags used by pin/bind */
 #define PIN_MAPPABLE   (1<<0)
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index f68f346..4bc0ce7 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -5356,3 +5356,50 @@ fail:
	drm_gem_object_unreference(&obj->base);
return ERR_PTR(ret);
 }
+
+/**
+ * i915_gem_object_vmap - map a GEM obj into kernel virtual space
+ * @obj: the GEM obj to be mapped
+ * @first: index of the first page where mapping starts
+ * @npages: how many pages to be mapped, starting from first page
+ *
+ * Map a given page range of GEM obj into kernel virtual space. The caller must
+ * make sure the associated pages are gathered and pinned before calling this
+ * function. vunmap should be called after use.
+ *
+ * Returns NULL on failure.
+ */
+void *i915_gem_object_vmap(struct drm_i915_gem_object *obj,
+  unsigned int first,
+  unsigned int npages)
+{
+   struct sg_page_iter sg_iter;
+   struct page **pages;
+   void *addr;
+   int i;
+
+   if (first + npages > obj->pages->nents) {
+   DRM_DEBUG_DRIVER("Invalid page count\n");
+   return NULL;
+   }
+
+   pages = drm_malloc_ab(npages, sizeof(*pages));
+   if (pages == NULL) {
+   DRM_DEBUG_DRIVER("Failed to get space for pages\n");
+   return NULL;
+   }
+
+   i = 0;
+   for_each_sg_page(obj->pages->sgl, &sg_iter, obj->pages->nents, first) {
+   pages[i++] = sg_page_iter_page(&sg_iter);
+   if (i == npages)
+   break;
+   }
+
+   addr = vmap(pages, npages, 0, PAGE_KERNEL);
+   if (addr == NULL)
+   DRM_DEBUG_DRIVER("Failed 

[Intel-gfx] [PATCH v2 0/2] Add i915_gem_object_vmap

2016-02-18 Thread yu . dai
From: Alex Dai 

There are several places in the driver where a GEM object is mapped to
kernel virtual space. Add a common function, i915_gem_object_vmap, to do
the vmap work for such use cases.

Alex Dai (2):
  drm/i915: Add i915_gem_object_vmap to map GEM object to virtual space
  drm/i915/guc: Simplify code by keeping vmap of guc_client object

 drivers/gpu/drm/i915/i915_cmd_parser.c | 28 +--
 drivers/gpu/drm/i915/i915_drv.h|  3 ++
 drivers/gpu/drm/i915/i915_gem.c| 47 +
 drivers/gpu/drm/i915/i915_gem_dmabuf.c | 16 ++---
 drivers/gpu/drm/i915/i915_guc_submission.c | 56 ++
 drivers/gpu/drm/i915/intel_guc.h   |  3 +-
 drivers/gpu/drm/i915/intel_ringbuffer.c| 24 ++---
 7 files changed, 77 insertions(+), 100 deletions(-)

-- 
2.5.0



[Intel-gfx] [PATCH v2 2/2] drm/i915/guc: Simplify code by keeping vmap of guc_client object

2016-02-18 Thread yu . dai
From: Alex Dai 

GuC client object is always pinned during its life cycle. We cache
the vmap of the client object, which includes the guc_process_desc,
doorbell and work queue. By doing so, we can simplify the code where
the driver communicates with the GuC.

As a result, this patch removes the kmap_atomic in wq_check_space,
where usleep_range could be called while the kmap_atomic mapping is
held. This fixes the issue below.

v2: Pass the actual page count to i915_gem_object_vmap(). Also, check
the return value for error handling. (Tvrtko Ursulin)
v1: vmap is done by i915_gem_object_vmap().

[   34.098798] BUG: scheduling while atomic: gem_close_race/1941/0x0002
[   34.098822] Modules linked in: hid_generic usbhid i915 asix usbnet libphy 
mii i2c_algo_bit drm_kms_helper cfbfillrect syscopyarea cfbimgblt sysfillrect 
sysimgblt fb_sys_fops cfbcopyarea drm coretemp i2c_hid hid video 
pinctrl_sunrisepoint pinctrl_intel acpi_pad nls_iso8859_1 e1000e ptp psmouse 
pps_core ahci libahci
[   34.098824] CPU: 0 PID: 1941 Comm: gem_close_race Tainted: G U  
4.4.0-160121+ #123
[   34.098824] Hardware name: Intel Corporation Skylake Client platform/Skylake 
AIO DDR3L RVP10, BIOS SKLSE2R1.R00.X100.B01.1509220551 09/22/2015
[   34.098825]  00013e40 880166c27a78 81280d02 
880172c13e40
[   34.098826]  880166c27a88 810c203a 880166c27ac8 
814ec808
[   34.098827]  88016b7c6000 880166c28000 000f4240 
0001
[   34.098827] Call Trace:
[   34.098831]  [] dump_stack+0x4b/0x79
[   34.098833]  [] __schedule_bug+0x41/0x4f
[   34.098834]  [] __schedule+0x5a8/0x690
[   34.098835]  [] schedule+0x37/0x80
[   34.098836]  [] schedule_hrtimeout_range_clock+0xad/0x130
[   34.098837]  [] ? hrtimer_init+0x10/0x10
[   34.098838]  [] ? schedule_hrtimeout_range_clock+0xa1/0x130
[   34.098839]  [] schedule_hrtimeout_range+0xe/0x10
[   34.098840]  [] usleep_range+0x3b/0x40
[   34.098853]  [] i915_guc_wq_check_space+0x119/0x210 [i915]
[   34.098861]  [] 
intel_logical_ring_alloc_request_extras+0x5c/0x70 [i915]
[   34.098869]  [] i915_gem_request_alloc+0x91/0x170 [i915]
[   34.098875]  [] 
i915_gem_do_execbuffer.isra.25+0xbc7/0x12a0 [i915]
[   34.098882]  [] ? 
i915_gem_object_get_pages_gtt+0x225/0x3c0 [i915]
[   34.098889]  [] ? i915_gem_pwrite_ioctl+0xd6/0x9f0 [i915]
[   34.098895]  [] i915_gem_execbuffer2+0xa8/0x250 [i915]
[   34.098900]  [] drm_ioctl+0x258/0x4f0 [drm]
[   34.098906]  [] ? i915_gem_execbuffer+0x340/0x340 [i915]
[   34.098908]  [] do_vfs_ioctl+0x2cd/0x4a0
[   34.098909]  [] ? __fget+0x72/0xb0
[   34.098910]  [] SyS_ioctl+0x3c/0x70
[   34.098911]  [] entry_SYSCALL_64_fastpath+0x12/0x6a
[   34.100208] [ cut here ]

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93847
Cc: Dave Gordon 
Cc: Daniel Vetter 
Cc: Tvrtko Ursulin 
Signed-off-by: Alex Dai 
---
 drivers/gpu/drm/i915/i915_guc_submission.c | 56 ++
 drivers/gpu/drm/i915/intel_guc.h   |  3 +-
 2 files changed, 21 insertions(+), 38 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index d7543ef..3e2ea42 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -195,11 +195,9 @@ static int guc_ring_doorbell(struct i915_guc_client *gc)
struct guc_process_desc *desc;
union guc_doorbell_qw db_cmp, db_exc, db_ret;
union guc_doorbell_qw *db;
-   void *base;
int attempt = 2, ret = -EAGAIN;
 
-   base = kmap_atomic(i915_gem_object_get_page(gc->client_obj, 0));
-   desc = base + gc->proc_desc_offset;
+   desc = gc->client_base + gc->proc_desc_offset;
 
/* Update the tail so it is visible to GuC */
desc->tail = gc->wq_tail;
@@ -215,7 +213,7 @@ static int guc_ring_doorbell(struct i915_guc_client *gc)
db_exc.cookie = 1;
 
/* pointer of current doorbell cacheline */
-   db = base + gc->doorbell_offset;
+   db = gc->client_base + gc->doorbell_offset;
 
while (attempt--) {
/* lets ring the doorbell */
@@ -244,10 +242,6 @@ static int guc_ring_doorbell(struct i915_guc_client *gc)
db_exc.cookie = 1;
}
 
-   /* Finally, update the cached copy of the GuC's WQ head */
-   gc->wq_head = desc->head;
-
-   kunmap_atomic(base);
return ret;
 }
 
@@ -341,10 +335,8 @@ static void guc_init_proc_desc(struct intel_guc *guc,
   struct i915_guc_client *client)
 {
struct guc_process_desc *desc;
-   void *base;
 
-   base = kmap_atomic(i915_gem_object_get_page(client->client_obj, 0));
-   desc = base + client->proc_desc_offset;
+   desc = client->client_base + client->proc_desc_offset;
 
memset(desc, 0, sizeof(*desc));
 
@@ -361,8 +353,6 @@ 

[Intel-gfx] [PATCH 0/2] Add i915_gem_object_vmap

2016-02-17 Thread yu . dai
From: Alex Dai 

There are several places in the driver where a GEM object is mapped to
kernel virtual space. Add a common function, i915_gem_object_vmap, to do
the vmap work for such use cases.

Alex Dai (2):
  drm/i915: Add i915_gem_object_vmap to map GEM object to virtual space
  drm/i915/guc: Simplify code by keeping vmap of guc_client object

 drivers/gpu/drm/i915/i915_cmd_parser.c | 28 +---
 drivers/gpu/drm/i915/i915_drv.h|  3 ++
 drivers/gpu/drm/i915/i915_gem.c| 44 +
 drivers/gpu/drm/i915/i915_gem_dmabuf.c | 15 ++---
 drivers/gpu/drm/i915/i915_guc_submission.c | 53 +-
 drivers/gpu/drm/i915/intel_guc.h   |  3 +-
 drivers/gpu/drm/i915/intel_ringbuffer.c| 24 ++
 7 files changed, 70 insertions(+), 100 deletions(-)

-- 
2.5.0



[Intel-gfx] [PATCH 2/2] drm/i915/guc: Simplify code by keeping vmap of guc_client object

2016-02-17 Thread yu . dai
From: Alex Dai 

GuC client object is always pinned during its life cycle. We cache
the vmap of the client object, which includes the guc_process_desc,
doorbell and work queue. By doing so, we can simplify the code where
the driver communicates with the GuC.

As a result, this patch removes the kmap_atomic in wq_check_space,
where usleep_range could be called while the kmap_atomic mapping is
held. This fixes the issue below.

[   34.098798] BUG: scheduling while atomic: gem_close_race/1941/0x0002
[   34.098822] Modules linked in: hid_generic usbhid i915 asix usbnet libphy 
mii i2c_algo_bit drm_kms_helper cfbfillrect syscopyarea cfbimgblt sysfillrect 
sysimgblt fb_sys_fops cfbcopyarea drm coretemp i2c_hid hid video 
pinctrl_sunrisepoint pinctrl_intel acpi_pad nls_iso8859_1 e1000e ptp psmouse 
pps_core ahci libahci
[   34.098824] CPU: 0 PID: 1941 Comm: gem_close_race Tainted: G U  
4.4.0-160121+ #123
[   34.098824] Hardware name: Intel Corporation Skylake Client platform/Skylake 
AIO DDR3L RVP10, BIOS SKLSE2R1.R00.X100.B01.1509220551 09/22/2015
[   34.098825]  00013e40 880166c27a78 81280d02 
880172c13e40
[   34.098826]  880166c27a88 810c203a 880166c27ac8 
814ec808
[   34.098827]  88016b7c6000 880166c28000 000f4240 
0001
[   34.098827] Call Trace:
[   34.098831]  [] dump_stack+0x4b/0x79
[   34.098833]  [] __schedule_bug+0x41/0x4f
[   34.098834]  [] __schedule+0x5a8/0x690
[   34.098835]  [] schedule+0x37/0x80
[   34.098836]  [] schedule_hrtimeout_range_clock+0xad/0x130
[   34.098837]  [] ? hrtimer_init+0x10/0x10
[   34.098838]  [] ? schedule_hrtimeout_range_clock+0xa1/0x130
[   34.098839]  [] schedule_hrtimeout_range+0xe/0x10
[   34.098840]  [] usleep_range+0x3b/0x40
[   34.098853]  [] i915_guc_wq_check_space+0x119/0x210 [i915]
[   34.098861]  [] 
intel_logical_ring_alloc_request_extras+0x5c/0x70 [i915]
[   34.098869]  [] i915_gem_request_alloc+0x91/0x170 [i915]
[   34.098875]  [] 
i915_gem_do_execbuffer.isra.25+0xbc7/0x12a0 [i915]
[   34.098882]  [] ? 
i915_gem_object_get_pages_gtt+0x225/0x3c0 [i915]
[   34.098889]  [] ? i915_gem_pwrite_ioctl+0xd6/0x9f0 [i915]
[   34.098895]  [] i915_gem_execbuffer2+0xa8/0x250 [i915]
[   34.098900]  [] drm_ioctl+0x258/0x4f0 [drm]
[   34.098906]  [] ? i915_gem_execbuffer+0x340/0x340 [i915]
[   34.098908]  [] do_vfs_ioctl+0x2cd/0x4a0
[   34.098909]  [] ? __fget+0x72/0xb0
[   34.098910]  [] SyS_ioctl+0x3c/0x70
[   34.098911]  [] entry_SYSCALL_64_fastpath+0x12/0x6a
[   34.100208] [ cut here ]

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93847
Cc: Dave Gordon 
Cc: Daniel Vetter 
Cc: Tvrtko Ursulin 
Signed-off-by: Alex Dai 
---
 drivers/gpu/drm/i915/i915_guc_submission.c | 53 +-
 drivers/gpu/drm/i915/intel_guc.h   |  3 +-
 2 files changed, 18 insertions(+), 38 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index d7543ef..3d6a1f7 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -195,11 +195,9 @@ static int guc_ring_doorbell(struct i915_guc_client *gc)
struct guc_process_desc *desc;
union guc_doorbell_qw db_cmp, db_exc, db_ret;
union guc_doorbell_qw *db;
-   void *base;
int attempt = 2, ret = -EAGAIN;
 
-   base = kmap_atomic(i915_gem_object_get_page(gc->client_obj, 0));
-   desc = base + gc->proc_desc_offset;
+   desc = gc->client_base + gc->proc_desc_offset;
 
/* Update the tail so it is visible to GuC */
desc->tail = gc->wq_tail;
@@ -215,7 +213,7 @@ static int guc_ring_doorbell(struct i915_guc_client *gc)
db_exc.cookie = 1;
 
/* pointer of current doorbell cacheline */
-   db = base + gc->doorbell_offset;
+   db = gc->client_base + gc->doorbell_offset;
 
while (attempt--) {
/* lets ring the doorbell */
@@ -244,10 +242,6 @@ static int guc_ring_doorbell(struct i915_guc_client *gc)
db_exc.cookie = 1;
}
 
-   /* Finally, update the cached copy of the GuC's WQ head */
-   gc->wq_head = desc->head;
-
-   kunmap_atomic(base);
return ret;
 }
 
@@ -341,10 +335,8 @@ static void guc_init_proc_desc(struct intel_guc *guc,
   struct i915_guc_client *client)
 {
struct guc_process_desc *desc;
-   void *base;
 
-   base = kmap_atomic(i915_gem_object_get_page(client->client_obj, 0));
-   desc = base + client->proc_desc_offset;
+   desc = client->client_base + client->proc_desc_offset;
 
memset(desc, 0, sizeof(*desc));
 
@@ -361,8 +353,6 @@ static void guc_init_proc_desc(struct intel_guc *guc,
desc->wq_size_bytes = client->wq_size;
desc->wq_status = WQ_STATUS_ACTIVE;
desc->priority = 

[Intel-gfx] [PATCH 1/2] drm/i915: Add i915_gem_object_vmap to map GEM object to virtual space

2016-02-17 Thread yu . dai
From: Alex Dai 

There are several places inside the driver where a GEM object is mapped
to kernel virtual space. The mapping is done either for the whole object
or for a certain page range of it.

This patch introduces a function, i915_gem_object_vmap, to do this job.

Signed-off-by: Alex Dai 
Cc: Dave Gordon 
Cc: Daniel Vetter 
Cc: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/i915_cmd_parser.c  | 28 +
 drivers/gpu/drm/i915/i915_drv.h |  3 +++
 drivers/gpu/drm/i915/i915_gem.c | 44 +
 drivers/gpu/drm/i915/i915_gem_dmabuf.c  | 15 ++-
 drivers/gpu/drm/i915/intel_ringbuffer.c | 24 ++
 5 files changed, 52 insertions(+), 62 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c
index 814d894..915e8c1 100644
--- a/drivers/gpu/drm/i915/i915_cmd_parser.c
+++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
@@ -863,37 +863,11 @@ find_reg(const struct drm_i915_reg_descriptor *table,
 static u32 *vmap_batch(struct drm_i915_gem_object *obj,
   unsigned start, unsigned len)
 {
-   int i;
-   void *addr = NULL;
-   struct sg_page_iter sg_iter;
int first_page = start >> PAGE_SHIFT;
int last_page = (len + start + 4095) >> PAGE_SHIFT;
int npages = last_page - first_page;
-   struct page **pages;
-
-   pages = drm_malloc_ab(npages, sizeof(*pages));
-   if (pages == NULL) {
-   DRM_DEBUG_DRIVER("Failed to get space for pages\n");
-   goto finish;
-   }
-
-   i = 0;
-   for_each_sg_page(obj->pages->sgl, &sg_iter, obj->pages->nents, first_page) {
-   pages[i++] = sg_page_iter_page(&sg_iter);
-   if (i == npages)
-   break;
-   }
-
-   addr = vmap(pages, i, 0, PAGE_KERNEL);
-   if (addr == NULL) {
-   DRM_DEBUG_DRIVER("Failed to vmap pages\n");
-   goto finish;
-   }
 
-finish:
-   if (pages)
-   drm_free_large(pages);
-   return (u32*)addr;
+   return (u32*)i915_gem_object_vmap(obj, first_page, npages);
 }
 
 /* Returns a vmap'd pointer to dest_obj, which the caller must unmap */
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 6644c2e..5b00a6a 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2899,6 +2899,9 @@ struct drm_i915_gem_object *i915_gem_object_create_from_data(
struct drm_device *dev, const void *data, size_t size);
 void i915_gem_free_object(struct drm_gem_object *obj);
 void i915_gem_vma_destroy(struct i915_vma *vma);
+void *i915_gem_object_vmap(struct drm_i915_gem_object *obj,
+  unsigned int first,
+  unsigned int npages);
 
 /* Flags used by pin/bind */
 #define PIN_MAPPABLE   (1<<0)
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index f68f346..a6f465b 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -5356,3 +5356,47 @@ fail:
	drm_gem_object_unreference(&obj->base);
return ERR_PTR(ret);
 }
+
+/**
+ * i915_gem_object_vmap - map a GEM obj into kernel virtual space
+ * @obj: the GEM obj to be mapped
+ * @first: index of the first page where mapping starts
+ * @npages: how many pages to be mapped, starting from first page
+ *
+ * Map a given page range of GEM obj into kernel virtual space. The caller must
+ * make sure the associated pages are gathered and pinned before calling this
+ * function. vunmap should be called after use.
+ *
+ * Returns NULL on failure.
+ */
+void *i915_gem_object_vmap(struct drm_i915_gem_object *obj,
+  unsigned int first,
+  unsigned int npages)
+{
+   struct sg_page_iter sg_iter;
+   struct page **pages;
+   void *addr;
+   int i;
+
+   if (first + npages > obj->pages->nents) {
+   DRM_DEBUG_DRIVER("Invalid page count\n");
+   return NULL;
+   }
+
+   pages = drm_malloc_ab(npages, sizeof(*pages));
+   if (pages == NULL) {
+   DRM_DEBUG_DRIVER("Failed to get space for pages\n");
+   return NULL;
+   }
+
+   i = 0;
+   for_each_sg_page(obj->pages->sgl, &sg_iter, npages, first)
+   pages[i++] = sg_page_iter_page(&sg_iter);
+
+   addr = vmap(pages, npages, 0, PAGE_KERNEL);
+   if (addr == NULL)
+   DRM_DEBUG_DRIVER("Failed to vmap pages\n");
+   drm_free_large(pages);
+
+   return addr;
+}
diff --git a/drivers/gpu/drm/i915/i915_gem_dmabuf.c b/drivers/gpu/drm/i915/i915_gem_dmabuf.c
index 1f3eef6..d269957 100644
--- a/drivers/gpu/drm/i915/i915_gem_dmabuf.c
+++ b/drivers/gpu/drm/i915/i915_gem_dmabuf.c
@@ -110,9 +110,7 @@ static void *i915_gem_dmabuf_vmap(struct 

Re: [Intel-gfx] [PATCH 1/2] drm/i915/guc: Simplify code by keeping kmap of guc_client object

2016-02-17 Thread Yu Dai



On 02/17/2016 08:04 AM, Daniel Vetter wrote:

On Tue, Feb 16, 2016 at 08:47:07AM -0800, Yu Dai wrote:
>
>
> On 02/15/2016 07:23 AM, Dave Gordon wrote:
> >On 12/02/16 13:03, Tvrtko Ursulin wrote:
> >>
> >> On 11/02/16 23:09, yu@intel.com wrote:
> >>> From: Alex Dai <yu@intel.com>
> >>>
> >>> GuC client object is always pinned during its life cycle. We cache
> >>> the kmap of its first page, which includes the guc_process_desc and
> >>> doorbell. By doing so, we can simplify the code where we read from
> >>> this page to find out how far the GuC has progressed on the work
> >>> queue, and the code where the driver programs the doorbell to send
> >>> a work queue item to the GuC.
> >
> >There's still one k(un)map_atomic() pair, in guc_add_workqueue_item().
> >Maybe we could get rid of that one too? So instead of kmapping only the
> >first page of the client, we could vmap() all three pages and so not
> >need to kmap_atomic() the WQ pages on the fly.
> >
> >There's a handy vmap_obj() function we might use, except it's currently
> >static ...
> >
> >
> Yes, there is a vmap_obj we can use but it is static. Actually two,
> vmap_batch() in i915_cmd_parser.c and vmap_obj() in intel_ringbuffer.c.
> Maybe it is a good idea to make it global, so GuC can use it too.

There should be a vmap function somewhere in the dma-buf code too iirc.


Yes, i915_gem_dmabuf_vmap. Let me try to make a common helper function that 
can be shared.

Alex



Re: [Intel-gfx] [PATCH 1/2] drm/i915/guc: Simplify code by keeping kmap of guc_client object

2016-02-16 Thread Yu Dai



On 02/15/2016 07:23 AM, Dave Gordon wrote:

On 12/02/16 13:03, Tvrtko Ursulin wrote:
>
> On 11/02/16 23:09, yu@intel.com wrote:
>> From: Alex Dai 
>>
>> GuC client object is always pinned during its life cycle. We cache
>> the kmap of its first page, which includes the guc_process_desc and
>> doorbell. By doing so, we can simplify the code where we read from
>> this page to find out how far the GuC has progressed on the work
>> queue, and the code where the driver programs the doorbell to send
>> a work queue item to the GuC.

There's still one k(un)map_atomic() pair, in guc_add_workqueue_item().
Maybe we could get rid of that one too? So instead of kmapping only the
first page of the client, we could vmap() all three pages and so not
need to kmap_atomic() the WQ pages on the fly.

There's a handy vmap_obj() function we might use, except it's currently
static ...


Yes, there is a vmap_obj we can use but it is static. Actually two, 
vmap_batch() in i915_cmd_parser.c and vmap_obj() in intel_ringbuffer.c. 
Maybe it is a good idea to make it global, so GuC can use it too.


Thanks,
Alex


Re: [Intel-gfx] [PATCH 1/2] drm/i915/guc: Simplify code by keeping kmap of guc_client object

2016-02-16 Thread Yu Dai



On 02/15/2016 06:39 AM, Dave Gordon wrote:

On 12/02/16 13:03, Tvrtko Ursulin wrote:
>
> On 11/02/16 23:09, yu@intel.com wrote:
>> From: Alex Dai 
>>
>> GuC client object is always pinned during its life cycle. We cache
>> the kmap of its first page, which includes the guc_process_desc and
>> doorbell. By doing so, we can simplify the code where we read from
>> this page to find out how far the GuC has progressed on the work
>> queue, and the code where the driver programs the doorbell to send
>> a work queue item to the GuC.

[snip]

>>
>> -/* Finally, update the cached copy of the GuC's WQ head */
>> -gc->wq_head = desc->head;
>
> Did you mean to remove the above?

I wondered that too at first, but the answer is "yes" -- see below.

>>
>> +client->client_base = kmap(i915_gem_object_get_dirty_page(obj, 0));
>
> Was this another bug, that the page/object wasn't dirtied before?

It wouldn't have made any difference; the object is pinned in the GTT
forever, so it can't be swapped out or reclaimed.

>> -uint32_t wq_head;
>
> Hm ok I don't get why kmap caching means removing this as well?

'wq_head' was an optimisation so that we could check whether there was
known to be space in the workqueue without kmapping and reading the
process descriptor. Now that the client (which includes the process
descriptor) is permanently mapped, there's no advantage to caching the
head; we might just as well read the current value from 'desc->head'
each time.
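For reference, the space check itself is just the kernel's circular-buffer arithmetic. A user-space re-implementation of the CIRC_CNT()/CIRC_SPACE() helpers from <linux/circ_buf.h> (valid only when size is a power of two) looks like:

```c
#include <assert.h>

/* Number of items currently in the buffer. */
static unsigned int circ_cnt(unsigned int head, unsigned int tail,
			     unsigned int size)
{
	return (head - tail) & (size - 1);
}

/* Free slots; one slot is always kept empty to tell full from empty. */
static unsigned int circ_space(unsigned int head, unsigned int tail,
			       unsigned int size)
{
	return circ_cnt(tail, head + 1, size);
}
```

In the driver's call `CIRC_SPACE(gc->wq_tail, gc->wq_head, gc->wq_size)`, the tail is the producer-side index and the head is the GuC's consumer index read from the process descriptor.
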

> Btw I don't see patch 2/2 ?
>

My bad, there is no 2/2. Thanks Dave for answering the questions. I have 
no more comments. :-)


Thanks,
Alex


[Intel-gfx] [PATCH 1/2] drm/i915/guc: Simplify code by keeping kmap of guc_client object

2016-02-11 Thread yu . dai
From: Alex Dai 

GuC client object is always pinned during its life cycle. We cache
the kmap of its first page, which includes the guc_process_desc and
doorbell. By doing so, we can simplify the code where we read from
this page to find out how far the GuC has progressed on the work
queue, and the code where the driver programs the doorbell to send
a work queue item to the GuC.
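The pattern boils down to keeping one long-lived mapping and reaching sub-structures by offset arithmetic, instead of a kmap_atomic()/kunmap_atomic() pair around every access. A user-space sketch, with all names hypothetical:

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the process descriptor living inside the client page. */
struct fake_proc_desc {
	unsigned int head;
	unsigned int tail;
};

/* Stand-in for a client whose first page stays permanently mapped. */
struct fake_client {
	unsigned char page[4096];	/* stands in for the pinned page */
	void *client_base;		/* cached mapping of that page */
	size_t proc_desc_offset;	/* where the descriptor sits */
};

static struct fake_proc_desc *client_proc_desc(struct fake_client *c)
{
	/* desc = client_base + proc_desc_offset, as in guc_ring_doorbell() */
	return (struct fake_proc_desc *)((unsigned char *)c->client_base +
					 c->proc_desc_offset);
}
```
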

As a result, this patch removes the kmap_atomic in wq_check_space,
where usleep_range could be called while the kmap_atomic mapping is
held. This fixes the issue below.

[   34.098798] BUG: scheduling while atomic: gem_close_race/1941/0x0002
[   34.098822] Modules linked in: hid_generic usbhid i915 asix usbnet libphy 
mii i2c_algo_bit drm_kms_helper cfbfillrect syscopyarea cfbimgblt sysfillrect 
sysimgblt fb_sys_fops cfbcopyarea drm coretemp i2c_hid hid video 
pinctrl_sunrisepoint pinctrl_intel acpi_pad nls_iso8859_1 e1000e ptp psmouse 
pps_core ahci libahci
[   34.098824] CPU: 0 PID: 1941 Comm: gem_close_race Tainted: G U  
4.4.0-160121+ #123
[   34.098824] Hardware name: Intel Corporation Skylake Client platform/Skylake 
AIO DDR3L RVP10, BIOS SKLSE2R1.R00.X100.B01.1509220551 09/22/2015
[   34.098825]  00013e40 880166c27a78 81280d02 
880172c13e40
[   34.098826]  880166c27a88 810c203a 880166c27ac8 
814ec808
[   34.098827]  88016b7c6000 880166c28000 000f4240 
0001
[   34.098827] Call Trace:
[   34.098831]  [] dump_stack+0x4b/0x79
[   34.098833]  [] __schedule_bug+0x41/0x4f
[   34.098834]  [] __schedule+0x5a8/0x690
[   34.098835]  [] schedule+0x37/0x80
[   34.098836]  [] schedule_hrtimeout_range_clock+0xad/0x130
[   34.098837]  [] ? hrtimer_init+0x10/0x10
[   34.098838]  [] ? schedule_hrtimeout_range_clock+0xa1/0x130
[   34.098839]  [] schedule_hrtimeout_range+0xe/0x10
[   34.098840]  [] usleep_range+0x3b/0x40
[   34.098853]  [] i915_guc_wq_check_space+0x119/0x210 [i915]
[   34.098861]  [] 
intel_logical_ring_alloc_request_extras+0x5c/0x70 [i915]
[   34.098869]  [] i915_gem_request_alloc+0x91/0x170 [i915]
[   34.098875]  [] 
i915_gem_do_execbuffer.isra.25+0xbc7/0x12a0 [i915]
[   34.098882]  [] ? 
i915_gem_object_get_pages_gtt+0x225/0x3c0 [i915]
[   34.098889]  [] ? i915_gem_pwrite_ioctl+0xd6/0x9f0 [i915]
[   34.098895]  [] i915_gem_execbuffer2+0xa8/0x250 [i915]
[   34.098900]  [] drm_ioctl+0x258/0x4f0 [drm]
[   34.098906]  [] ? i915_gem_execbuffer+0x340/0x340 [i915]
[   34.098908]  [] do_vfs_ioctl+0x2cd/0x4a0
[   34.098909]  [] ? __fget+0x72/0xb0
[   34.098910]  [] SyS_ioctl+0x3c/0x70
[   34.098911]  [] entry_SYSCALL_64_fastpath+0x12/0x6a
[   34.100208] [ cut here ]

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93847
Cc: 
Cc: 
Signed-off-by: Alex Dai 
---
 drivers/gpu/drm/i915/i915_guc_submission.c | 39 +-
 drivers/gpu/drm/i915/intel_guc.h   |  3 ++-
 2 files changed, 14 insertions(+), 28 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index d7543ef..d51015e 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -195,11 +195,9 @@ static int guc_ring_doorbell(struct i915_guc_client *gc)
struct guc_process_desc *desc;
union guc_doorbell_qw db_cmp, db_exc, db_ret;
union guc_doorbell_qw *db;
-   void *base;
int attempt = 2, ret = -EAGAIN;
 
-   base = kmap_atomic(i915_gem_object_get_page(gc->client_obj, 0));
-   desc = base + gc->proc_desc_offset;
+   desc = gc->client_base + gc->proc_desc_offset;
 
/* Update the tail so it is visible to GuC */
desc->tail = gc->wq_tail;
@@ -215,7 +213,7 @@ static int guc_ring_doorbell(struct i915_guc_client *gc)
db_exc.cookie = 1;
 
/* pointer of current doorbell cacheline */
-   db = base + gc->doorbell_offset;
+   db = gc->client_base + gc->doorbell_offset;
 
while (attempt--) {
/* lets ring the doorbell */
@@ -244,10 +242,6 @@ static int guc_ring_doorbell(struct i915_guc_client *gc)
db_exc.cookie = 1;
}
 
-   /* Finally, update the cached copy of the GuC's WQ head */
-   gc->wq_head = desc->head;
-
-   kunmap_atomic(base);
return ret;
 }
 
@@ -341,10 +335,8 @@ static void guc_init_proc_desc(struct intel_guc *guc,
   struct i915_guc_client *client)
 {
struct guc_process_desc *desc;
-   void *base;
 
-   base = kmap_atomic(i915_gem_object_get_page(client->client_obj, 0));
-   desc = base + client->proc_desc_offset;
+   desc = client->client_base + client->proc_desc_offset;
 
memset(desc, 0, sizeof(*desc));
 
@@ -361,8 +353,6 @@ static void guc_init_proc_desc(struct intel_guc *guc,
desc->wq_size_bytes = client->wq_size;
desc->wq_status = 

Re: [Intel-gfx] [PATCH] drm/i915/guc: Set init value for cached work queue head

2016-02-10 Thread Yu Dai



On 02/10/2016 09:30 AM, Tvrtko Ursulin wrote:

Hi,

On 10/02/16 00:05, yu@intel.com wrote:
> From: Alex Dai 
>
> The cached work queue head pointer is set to the last byte of the
> work queue buffer. This makes sure the whole work queue buffer is
> available after coming back from a reset or init.
>
> Do not hold the kmap_atomic mapping before going to sleep when the
> work queue is full.

Could you please split this into two patches? They are two completely
separate issues and it is customary to do so.

For the kmap_atomic issue you can also reference
https://bugs.freedesktop.org/show_bug.cgi?id=93847 in the commit message.


Yes, will do.

> Signed-off-by: Alex Dai 
> ---
>   drivers/gpu/drm/i915/i915_guc_submission.c | 10 +-
>   1 file changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
> index d7543ef..41f4a96 100644
> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
> @@ -486,11 +486,11 @@ int i915_guc_wq_check_space(struct i915_guc_client *gc)
>if (CIRC_SPACE(gc->wq_tail, gc->wq_head, gc->wq_size) >= size)
>return 0;
>
> -  base = kmap_atomic(i915_gem_object_get_page(gc->client_obj, 0));
> -  desc = base + gc->proc_desc_offset;
> -
>while (timeout_counter-- > 0) {
> +  base = kmap_atomic(i915_gem_object_get_page(gc->client_obj, 0));
> +  desc = base + gc->proc_desc_offset;
>gc->wq_head = desc->head;
> +  kunmap_atomic(base);
>
>if (CIRC_SPACE(gc->wq_tail, gc->wq_head, gc->wq_size) >= size) {
>ret = 0;
> @@ -501,8 +501,6 @@ int i915_guc_wq_check_space(struct i915_guc_client *gc)
>usleep_range(1000, 2000);
>};
>
> -  kunmap_atomic(base);
> -
>return ret;
>   }

This part is OK to extinguish this fire. But in general you could also
consider caching the kmap in the client, since it looks to me that the
object is persistently pinned for its lifetime. So kmap_atomic just
complicates things.


Yes, this object must be pinned for its lifetime, as it is used by the
GuC internally too. I will think about a way to cache it.
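
For reference, the suggested caching might look something like the following kernel-style sketch (not compilable stand-alone; a cached client_base pointer member on i915_guc_client is assumed here): map the first page once at client-creation time with kmap(), which is safe because the object stays pinned, and drop the mapping only at client teardown. The per-access kmap_atomic/kunmap_atomic pairs then disappear.

```c
/* Sketch only -- kernel APIs, assumes a cached client_base member. */
static int guc_client_map(struct i915_guc_client *client)
{
	struct page *page = i915_gem_object_get_page(client->client_obj, 0);

	/* kmap(), not kmap_atomic(): the mapping is long-lived and its
	 * users may sleep while it is held. */
	client->client_base = kmap(page);
	return client->client_base ? 0 : -ENOMEM;
}

static void guc_client_unmap(struct i915_guc_client *client)
{
	if (client->client_base)
		kunmap(kmap_to_page(client->client_base));
}
```

Accessors would then become plain pointer arithmetic, e.g. `desc = client->client_base + client->proc_desc_offset;`, with no map/unmap at each call site.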



> @@ -730,6 +728,8 @@ static struct i915_guc_client *guc_client_alloc(struct drm_device *dev,
>client->client_obj = obj;
>client->wq_offset = GUC_DB_SIZE;
>client->wq_size = GUC_WQ_SIZE;
> +  client->wq_head = GUC_WQ_SIZE - 1;
> +  client->wq_tail = 0;
>
>client->doorbell_offset = select_doorbell_cacheline(guc);
>
>

This one I can't really figure out without, I suppose, knowing more about
the code design. How come it was OK when it was zero (apart from after a
reset)?

The value is otherwise only updated from the GuC shared page, and the
driver does not appear to modify it. Perhaps just a better commit
message to explain things?


The way the kernel's CIRC_xx macros work is that they leave one byte free
and treat the head == tail case as empty. So there won't be a problem if
this head happens to be 0. But if it comes back with some random number in
[1, sizeof(WQ item)], the driver will dead-loop in software, waiting for
space that never appears.


And I will split this patch into two.

Thanks,
Alex
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH] drm/i915/guc: Set init value for cached work queue head

2016-02-09 Thread yu . dai
From: Alex Dai 

The cached work queue head pointer is set to the last byte of the
work queue buffer. This makes sure the whole work queue buffer is
available after coming back from a reset or init.

Do not hold the kmap_atomic mapping before going to sleep when the
work queue is full.

Signed-off-by: Alex Dai 
---
 drivers/gpu/drm/i915/i915_guc_submission.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index d7543ef..41f4a96 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -486,11 +486,11 @@ int i915_guc_wq_check_space(struct i915_guc_client *gc)
if (CIRC_SPACE(gc->wq_tail, gc->wq_head, gc->wq_size) >= size)
return 0;
 
-   base = kmap_atomic(i915_gem_object_get_page(gc->client_obj, 0));
-   desc = base + gc->proc_desc_offset;
-
while (timeout_counter-- > 0) {
+   base = kmap_atomic(i915_gem_object_get_page(gc->client_obj, 0));
+   desc = base + gc->proc_desc_offset;
gc->wq_head = desc->head;
+   kunmap_atomic(base);
 
if (CIRC_SPACE(gc->wq_tail, gc->wq_head, gc->wq_size) >= size) {
ret = 0;
@@ -501,8 +501,6 @@ int i915_guc_wq_check_space(struct i915_guc_client *gc)
usleep_range(1000, 2000);
};
 
-   kunmap_atomic(base);
-
return ret;
 }
 
@@ -730,6 +728,8 @@ static struct i915_guc_client *guc_client_alloc(struct drm_device *dev,
client->client_obj = obj;
client->wq_offset = GUC_DB_SIZE;
client->wq_size = GUC_WQ_SIZE;
+   client->wq_head = GUC_WQ_SIZE - 1;
+   client->wq_tail = 0;
 
client->doorbell_offset = select_doorbell_cacheline(guc);
 
-- 
2.5.0



[Intel-gfx] [PATCH v2 3/6] drm/i915/huc: Unified css_header struct for GuC and HuC

2016-02-08 Thread yu . dai
From: Alex Dai 

The HuC firmware css header has almost exactly the same definition as
the GuC firmware's, except for the sw_version. Also, add a new member
fw_type to intel_uc_fw to indicate what kind of fw it is, so the loader
will pull the right sw_version from the header.

Signed-off-by: Alex Dai 
Signed-off-by: Peter Antoine 
---
 drivers/gpu/drm/i915/intel_guc.h|  4 
 drivers/gpu/drm/i915/intel_guc_fwif.h   | 16 ++---
 drivers/gpu/drm/i915/intel_guc_loader.c | 42 +
 3 files changed, 44 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
index 4e20b0c..3b26e13 100644
--- a/drivers/gpu/drm/i915/intel_guc.h
+++ b/drivers/gpu/drm/i915/intel_guc.h
@@ -59,6 +59,9 @@ enum intel_uc_fw_status {
UC_FIRMWARE_SUCCESS
 };
 
+#define UC_FW_TYPE_GUC 0
+#define UC_FW_TYPE_HUC 1
+
 /*
  * This structure encapsulates all the data needed during the process
  * of fetching, caching, and loading the firmware image into the GuC.
@@ -76,6 +79,7 @@ struct intel_uc_fw {
uint16_t major_ver_found;
uint16_t minor_ver_found;
 
+   uint32_t fw_type;
uint32_t header_size;
uint32_t header_offset;
uint32_t rsa_size;
diff --git a/drivers/gpu/drm/i915/intel_guc_fwif.h b/drivers/gpu/drm/i915/intel_guc_fwif.h
index 2de57ff..a7bedc8 100644
--- a/drivers/gpu/drm/i915/intel_guc_fwif.h
+++ b/drivers/gpu/drm/i915/intel_guc_fwif.h
@@ -153,7 +153,7 @@
  * The GuC firmware layout looks like this:
  *
  * +---+
- * |guc_css_header |
+ * | uc_css_header |
  * | contains major/minor version  |
  * +---+
  * | uCode |
@@ -179,9 +179,16 @@
  * 3. Length info of each component can be found in header, in dwords.
  * 4. Modulus and exponent key are not required by driver. They may not appear
  * in fw. So driver will load a truncated firmware in this case.
+ *
+ * HuC firmware layout is same as GuC firmware.
+ *
+ * HuC firmware css header is different. However, the only difference is where
+ * the version information is saved. The uc_css_header is unified to support
+ * both. Driver should get HuC version from uc_css_header.huc_sw_version, while
+ * uc_css_header.guc_sw_version for GuC.
  */
 
-struct guc_css_header {
+struct uc_css_header {
uint32_t module_type;
/* header_size includes all non-uCode bits, including css_header, rsa
 * key, modulus key and exponent data. */
@@ -212,7 +219,10 @@ struct guc_css_header {
 
char username[8];
char buildnumber[12];
-   uint32_t device_id;
+   union {
+   uint32_t device_id;
+   uint32_t huc_sw_version;
+   };
uint32_t guc_sw_version;
uint32_t prod_preprod_fw;
uint32_t reserved[12];
diff --git a/drivers/gpu/drm/i915/intel_guc_loader.c b/drivers/gpu/drm/i915/intel_guc_loader.c
index 482a5e4..261ae5b 100644
--- a/drivers/gpu/drm/i915/intel_guc_loader.c
+++ b/drivers/gpu/drm/i915/intel_guc_loader.c
@@ -465,7 +465,7 @@ void intel_uc_fw_fetch(struct drm_device *dev, struct intel_uc_fw *uc_fw)
 {
struct drm_i915_gem_object *obj;
const struct firmware *fw;
-   struct guc_css_header *css;
+   struct uc_css_header *css;
size_t size;
int err;
 
@@ -482,19 +482,19 @@ void intel_uc_fw_fetch(struct drm_device *dev, struct intel_uc_fw *uc_fw)
uc_fw->uc_fw_path, fw);
 
/* Check the size of the blob before examining buffer contents */
-   if (fw->size < sizeof(struct guc_css_header)) {
+   if (fw->size < sizeof(struct uc_css_header)) {
DRM_ERROR("Firmware header is missing\n");
goto fail;
}
 
-   css = (struct guc_css_header *)fw->data;
+   css = (struct uc_css_header *)fw->data;
 
/* Firmware bits always start from header */
uc_fw->header_offset = 0;
uc_fw->header_size = (css->header_size_dw - css->modulus_size_dw -
css->key_size_dw - css->exponent_size_dw) * sizeof(u32);
 
-   if (uc_fw->header_size != sizeof(struct guc_css_header)) {
+   if (uc_fw->header_size != sizeof(struct uc_css_header)) {
DRM_ERROR("CSS header definition mismatch\n");
goto fail;
}
@@ -518,23 +518,35 @@ void intel_uc_fw_fetch(struct drm_device *dev, struct intel_uc_fw *uc_fw)
goto fail;
}
 
-   /* Header and uCode will be loaded to WOPCM. Size of the two. */
-   size = uc_fw->header_size + uc_fw->ucode_size;
-
-   /* Top 32k of WOPCM is reserved (8K stack + 24k RC6 context). */
-   if (size > GUC_WOPCM_SIZE_VALUE - 0x8000) {
-   DRM_ERROR("Firmware is too large to fit in WOPCM\n");
-   goto fail;
-   }
-
/*
 

[Intel-gfx] [PATCH v2 0/6] Add HuC loading and authentication support

2016-02-08 Thread yu . dai
From: Alex Dai 

The current GuC loading helper functions have been utilized for HuC
loading too. The firmware css_header is unified for GuC and HuC. Note
that driver init won't fail even if HuC loading fails.

v2: Rebase to latest kernel.

Alex Dai (6):
  drm/i915/guc: Make the GuC fw loading helper functions general
  drm/i915/guc: Bypass fw loading gracefully if GuC is not supported
  drm/i915/huc: Unified css_header struct for GuC and HuC
  drm/i915/huc: Add HuC fw loading support
  drm/i915/huc: Add debugfs for HuC loading status check
  drm/i915/huc: Support HuC authentication

 drivers/gpu/drm/i915/Makefile  |   1 +
 drivers/gpu/drm/i915/i915_debugfs.c|  44 -
 drivers/gpu/drm/i915/i915_dma.c|   3 +
 drivers/gpu/drm/i915/i915_drv.h|   3 +
 drivers/gpu/drm/i915/i915_gem.c|   7 +
 drivers/gpu/drm/i915/i915_guc_reg.h|   3 +
 drivers/gpu/drm/i915/i915_guc_submission.c |  65 +++
 drivers/gpu/drm/i915/intel_guc.h   |  45 ++---
 drivers/gpu/drm/i915/intel_guc_fwif.h  |  17 +-
 drivers/gpu/drm/i915/intel_guc_loader.c| 242 +-
 drivers/gpu/drm/i915/intel_huc.h   |  44 +
 drivers/gpu/drm/i915/intel_huc_loader.c| 262 +
 12 files changed, 592 insertions(+), 144 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/intel_huc.h
 create mode 100644 drivers/gpu/drm/i915/intel_huc_loader.c

-- 
2.5.0



[Intel-gfx] [PATCH v2 1/6] drm/i915/guc: Make the GuC fw loading helper functions general

2016-02-08 Thread yu . dai
From: Alex Dai 

Rename some of the GuC fw loading code to make it more general. We
will utilize it for HuC loading as well.
s/intel_guc_fw/intel_uc_fw/g
s/GUC_FIRMWARE/UC_FIRMWARE/g

Struct intel_guc_fw is renamed to intel_uc_fw. The prefix of its
members, such as 'guc' or 'guc_fw', is either renamed to 'uc' or
removed, for the same purpose.

Signed-off-by: Alex Dai 
Signed-off-by: Peter Antoine 
---
 drivers/gpu/drm/i915/i915_debugfs.c |  12 +--
 drivers/gpu/drm/i915/intel_guc.h|  39 +++
 drivers/gpu/drm/i915/intel_guc_loader.c | 177 +---
 3 files changed, 120 insertions(+), 108 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index ec0c2a05e..873f1b2 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2402,7 +2402,7 @@ static int i915_guc_load_status_info(struct seq_file *m, void *data)
 {
struct drm_info_node *node = m->private;
struct drm_i915_private *dev_priv = node->minor->dev->dev_private;
-   struct intel_guc_fw *guc_fw = &dev_priv->guc.guc_fw;
+   struct intel_uc_fw *guc_fw = &dev_priv->guc.guc_fw;
u32 tmp, i;
 
if (!HAS_GUC_UCODE(dev_priv->dev))
@@ -2410,15 +2410,15 @@ static int i915_guc_load_status_info(struct seq_file *m, void *data)
 
seq_printf(m, "GuC firmware status:\n");
seq_printf(m, "\tpath: %s\n",
-   guc_fw->guc_fw_path);
+   guc_fw->uc_fw_path);
seq_printf(m, "\tfetch: %s\n",
-   intel_guc_fw_status_repr(guc_fw->guc_fw_fetch_status));
+   intel_uc_fw_status_repr(guc_fw->fetch_status));
seq_printf(m, "\tload: %s\n",
-   intel_guc_fw_status_repr(guc_fw->guc_fw_load_status));
+   intel_uc_fw_status_repr(guc_fw->load_status));
seq_printf(m, "\tversion wanted: %d.%d\n",
-   guc_fw->guc_fw_major_wanted, guc_fw->guc_fw_minor_wanted);
+   guc_fw->major_ver_wanted, guc_fw->minor_ver_wanted);
seq_printf(m, "\tversion found: %d.%d\n",
-   guc_fw->guc_fw_major_found, guc_fw->guc_fw_minor_found);
+   guc_fw->major_ver_found, guc_fw->minor_ver_found);
seq_printf(m, "\theader: offset is %d; size = %d\n",
guc_fw->header_offset, guc_fw->header_size);
seq_printf(m, "\tuCode: offset is %d; size = %d\n",
diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
index 73002e9..4e20b0c 100644
--- a/drivers/gpu/drm/i915/intel_guc.h
+++ b/drivers/gpu/drm/i915/intel_guc.h
@@ -52,29 +52,29 @@ struct i915_guc_client {
int retcode;
 };
 
-enum intel_guc_fw_status {
-   GUC_FIRMWARE_FAIL = -1,
-   GUC_FIRMWARE_NONE = 0,
-   GUC_FIRMWARE_PENDING,
-   GUC_FIRMWARE_SUCCESS
+enum intel_uc_fw_status {
+   UC_FIRMWARE_FAIL = -1,
+   UC_FIRMWARE_NONE = 0,
+   UC_FIRMWARE_PENDING,
+   UC_FIRMWARE_SUCCESS
 };
 
 /*
  * This structure encapsulates all the data needed during the process
  * of fetching, caching, and loading the firmware image into the GuC.
  */
-struct intel_guc_fw {
-   struct drm_device * guc_dev;
-   const char *guc_fw_path;
-   size_t  guc_fw_size;
-   struct drm_i915_gem_object *guc_fw_obj;
-   enum intel_guc_fw_statusguc_fw_fetch_status;
-   enum intel_guc_fw_statusguc_fw_load_status;
-
-   uint16_tguc_fw_major_wanted;
-   uint16_tguc_fw_minor_wanted;
-   uint16_tguc_fw_major_found;
-   uint16_tguc_fw_minor_found;
+struct intel_uc_fw {
+   struct drm_device * uc_dev;
+   const char *uc_fw_path;
+   size_t  uc_fw_size;
+   struct drm_i915_gem_object *uc_fw_obj;
+   enum intel_uc_fw_status fetch_status;
+   enum intel_uc_fw_status load_status;
+
+   uint16_t major_ver_wanted;
+   uint16_t minor_ver_wanted;
+   uint16_t major_ver_found;
+   uint16_t minor_ver_found;
 
uint32_t header_size;
uint32_t header_offset;
@@ -85,7 +85,7 @@ struct intel_guc_fw {
 };
 
 struct intel_guc {
-   struct intel_guc_fw guc_fw;
+   struct intel_uc_fw guc_fw;
uint32_t log_flags;
struct drm_i915_gem_object *log_obj;
 
@@ -114,9 +114,10 @@ struct intel_guc {
 extern void intel_guc_ucode_init(struct drm_device *dev);
 extern int intel_guc_ucode_load(struct drm_device *dev);
 extern void intel_guc_ucode_fini(struct drm_device *dev);
-extern const char *intel_guc_fw_status_repr(enum intel_guc_fw_status status);
+extern const char *intel_uc_fw_status_repr(enum intel_uc_fw_status status);
 extern int intel_guc_suspend(struct drm_device *dev);
 extern int intel_guc_resume(struct drm_device 

[Intel-gfx] [PATCH v2 5/6] drm/i915/huc: Add debugfs for HuC loading status check

2016-02-08 Thread yu . dai
From: Alex Dai 

Add debugfs entry for HuC loading status check.

Signed-off-by: Alex Dai 
Signed-off-by: Peter Antoine 
---
 drivers/gpu/drm/i915/i915_debugfs.c | 32 
 1 file changed, 32 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 873f1b2..4521fe6 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2398,6 +2398,37 @@ static int i915_llc(struct seq_file *m, void *data)
return 0;
 }
 
+static int i915_huc_load_status_info(struct seq_file *m, void *data)
+{
+   struct drm_info_node *node = m->private;
+   struct drm_i915_private *dev_priv = node->minor->dev->dev_private;
+   struct intel_uc_fw *huc_fw = &dev_priv->huc.huc_fw;
+
+   if (!HAS_HUC_UCODE(dev_priv->dev))
+   return 0;
+
+   seq_printf(m, "HuC firmware status:\n");
+   seq_printf(m, "\tpath: %s\n", huc_fw->uc_fw_path);
+   seq_printf(m, "\tfetch: %s\n",
+   intel_uc_fw_status_repr(huc_fw->fetch_status));
+   seq_printf(m, "\tload: %s\n",
+   intel_uc_fw_status_repr(huc_fw->load_status));
+   seq_printf(m, "\tversion wanted: %d.%d\n",
+   huc_fw->major_ver_wanted, huc_fw->minor_ver_wanted);
+   seq_printf(m, "\tversion found: %d.%d\n",
+   huc_fw->major_ver_found, huc_fw->minor_ver_found);
+   seq_printf(m, "\theader: offset is %d; size = %d\n",
+   huc_fw->header_offset, huc_fw->header_size);
+   seq_printf(m, "\tuCode: offset is %d; size = %d\n",
+   huc_fw->ucode_offset, huc_fw->ucode_size);
+   seq_printf(m, "\tRSA: offset is %d; size = %d\n",
+   huc_fw->rsa_offset, huc_fw->rsa_size);
+
+   seq_printf(m, "\nHuC status 0x%08x:\n", I915_READ(HUC_STATUS2));
+
+   return 0;
+}
+
 static int i915_guc_load_status_info(struct seq_file *m, void *data)
 {
struct drm_info_node *node = m->private;
@@ -5347,6 +5378,7 @@ static const struct drm_info_list i915_debugfs_list[] = {
{"i915_guc_info", i915_guc_info, 0},
{"i915_guc_load_status", i915_guc_load_status_info, 0},
{"i915_guc_log_dump", i915_guc_log_dump, 0},
+   {"i915_huc_load_status", i915_huc_load_status_info, 0},
{"i915_frequency_info", i915_frequency_info, 0},
{"i915_hangcheck_info", i915_hangcheck_info, 0},
{"i915_drpc_info", i915_drpc_info, 0},
-- 
2.5.0



[Intel-gfx] [PATCH v2 2/6] drm/i915/guc: Bypass fw loading gracefully if GuC is not supported

2016-02-08 Thread yu . dai
From: Alex Dai 

This is to rework previous patch:

commit 9f9e539f90bcecfdc7b3679d337b7a62d4313205
Author: Daniel Vetter 
Date:   Fri Oct 23 11:10:59 2015 +0200

drm/i915: Shut up GuC errors when it's disabled

There is a case where GuC loading is needed even when GuC submission
is disabled: for example, HuC loading and authentication require the
GuC to be loaded regardless. With this patch, the driver will try to
load the firmware only when the platform explicitly asks for it by
specifying a fw name and version. All other cases are treated as
UC_FIRMWARE_NONE and the loading is bypassed silently.

Signed-off-by: Alex Dai 
---
 drivers/gpu/drm/i915/intel_guc_loader.c | 32 +++-
 1 file changed, 11 insertions(+), 21 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_guc_loader.c b/drivers/gpu/drm/i915/intel_guc_loader.c
index 318b5fd..482a5e4 100644
--- a/drivers/gpu/drm/i915/intel_guc_loader.c
+++ b/drivers/gpu/drm/i915/intel_guc_loader.c
@@ -597,39 +597,29 @@ void intel_guc_ucode_init(struct drm_device *dev)
 {
struct drm_i915_private *dev_priv = dev->dev_private;
struct intel_uc_fw *guc_fw = &dev_priv->guc.guc_fw;
-   const char *fw_path;
+   const char *fw_path = NULL;
+
+   guc_fw->uc_dev = dev;
+   guc_fw->uc_fw_path = NULL;
+   guc_fw->fetch_status = UC_FIRMWARE_NONE;
+   guc_fw->load_status = UC_FIRMWARE_NONE;
 
if (!HAS_GUC_SCHED(dev))
i915.enable_guc_submission = false;
 
-   if (!HAS_GUC_UCODE(dev)) {
-   fw_path = NULL;
-   } else if (IS_SKYLAKE(dev)) {
+   if (!HAS_GUC_UCODE(dev))
+   return;
+
+   if (IS_SKYLAKE(dev)) {
fw_path = I915_SKL_GUC_UCODE;
guc_fw->major_ver_wanted = 4;
guc_fw->minor_ver_wanted = 3;
-   } else {
-   i915.enable_guc_submission = false;
-   fw_path = "";   /* unknown device */
}
 
-   if (!i915.enable_guc_submission)
-   return;
-
-   guc_fw->uc_dev = dev;
-   guc_fw->uc_fw_path = fw_path;
-   guc_fw->fetch_status = UC_FIRMWARE_NONE;
-   guc_fw->load_status = UC_FIRMWARE_NONE;
-
if (fw_path == NULL)
return;
 
-   if (*fw_path == '\0') {
-   DRM_ERROR("No GuC firmware known for this platform\n");
-   guc_fw->fetch_status = UC_FIRMWARE_FAIL;
-   return;
-   }
-
+   guc_fw->uc_fw_path = fw_path;
guc_fw->fetch_status = UC_FIRMWARE_PENDING;
DRM_DEBUG_DRIVER("GuC firmware pending, path %s\n", fw_path);
intel_uc_fw_fetch(dev, guc_fw);
-- 
2.5.0



[Intel-gfx] [PATCH v2 4/6] drm/i915/huc: Add HuC fw loading support

2016-02-08 Thread yu . dai
From: Alex Dai 

The HuC loading process is similar to the GuC's; intel_uc_fw_fetch()
is used in both cases.

HuC loading needs to happen before GuC loading, and the WOPCM setup
must be done early, before loading either of them.

Signed-off-by: Alex Dai 
Signed-off-by: Peter Antoine 
---
 drivers/gpu/drm/i915/Makefile   |   1 +
 drivers/gpu/drm/i915/i915_dma.c |   3 +
 drivers/gpu/drm/i915/i915_drv.h |   3 +
 drivers/gpu/drm/i915/i915_gem.c |   7 +
 drivers/gpu/drm/i915/i915_guc_reg.h |   3 +
 drivers/gpu/drm/i915/intel_guc_loader.c |   7 +-
 drivers/gpu/drm/i915/intel_huc.h|  44 ++
 drivers/gpu/drm/i915/intel_huc_loader.c | 262 
 8 files changed, 325 insertions(+), 5 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/intel_huc.h
 create mode 100644 drivers/gpu/drm/i915/intel_huc_loader.c

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 0851de07..693cc8f 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -42,6 +42,7 @@ i915-y += i915_cmd_parser.o \
 
 # general-purpose microcontroller (GuC) support
 i915-y += intel_guc_loader.o \
+ intel_huc_loader.o \
  i915_guc_submission.o
 
 # autogenerated null render state
diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index 120125b..6526cf7 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -405,6 +405,7 @@ static int i915_load_modeset_init(struct drm_device *dev)
 * working irqs for e.g. gmbus and dp aux transfers. */
intel_modeset_init(dev);
 
+   intel_huc_ucode_init(dev);
intel_guc_ucode_init(dev);
 
ret = i915_gem_init(dev);
@@ -448,6 +449,7 @@ cleanup_gem:
i915_gem_context_fini(dev);
mutex_unlock(&dev->struct_mutex);
 cleanup_irq:
+   intel_huc_ucode_fini(dev);
intel_guc_ucode_fini(dev);
drm_irq_uninstall(dev);
intel_teardown_gmbus(dev);
@@ -1251,6 +1253,7 @@ int i915_driver_unload(struct drm_device *dev)
/* Flush any outstanding unpin_work. */
flush_workqueue(dev_priv->wq);
 
+   intel_huc_ucode_fini(dev);
intel_guc_ucode_fini(dev);
mutex_lock(&dev->struct_mutex);
i915_gem_cleanup_ringbuffer(dev);
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 8216665..c9c7378 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -53,6 +53,7 @@
 #include 
 #include 
 #include "intel_guc.h"
+#include "intel_huc.h"
 
 /* General customization:
  */
@@ -1744,6 +1745,7 @@ struct drm_i915_private {
 
struct i915_virtual_gpu vgpu;
 
+   struct intel_huc huc;
struct intel_guc guc;
 
struct intel_csr csr;
@@ -2673,6 +2675,7 @@ struct drm_i915_cmd_table {
 
 #define HAS_GUC_UCODE(dev) (IS_GEN9(dev) && !IS_KABYLAKE(dev))
 #define HAS_GUC_SCHED(dev) (IS_GEN9(dev) && !IS_KABYLAKE(dev))
+#define HAS_HUC_UCODE(dev) (IS_GEN9(dev) && !IS_KABYLAKE(dev))
 
 #define HAS_RESOURCE_STREAMER(dev) (IS_HASWELL(dev) || \
INTEL_INFO(dev)->gen >= 8)
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index e9b19bc..56f243f 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4890,6 +4890,13 @@ i915_gem_init_hw(struct drm_device *dev)
 
/* We can't enable contexts until all firmware is loaded */
if (HAS_GUC_UCODE(dev)) {
+   /* init WOPCM */
+   I915_WRITE(GUC_WOPCM_SIZE, GUC_WOPCM_SIZE_VALUE);
+   I915_WRITE(DMA_GUC_WOPCM_OFFSET, GUC_WOPCM_OFFSET_VALUE |
+   HUC_LOADING_AGENT_GUC);
+
+   intel_huc_ucode_load(dev);
+
ret = intel_guc_ucode_load(dev);
if (ret) {
DRM_ERROR("Failed to initialize GuC, error %d\n", ret);
diff --git a/drivers/gpu/drm/i915/i915_guc_reg.h b/drivers/gpu/drm/i915/i915_guc_reg.h
index e4ba582..8d27c09 100644
--- a/drivers/gpu/drm/i915/i915_guc_reg.h
+++ b/drivers/gpu/drm/i915/i915_guc_reg.h
@@ -52,9 +52,12 @@
 #define   DMA_ADDRESS_SPACE_GTT  (8 << 16)
 #define DMA_COPY_SIZE  _MMIO(0xc310)
 #define DMA_CTRL   _MMIO(0xc314)
+#define   HUC_UKERNEL(1<<9)
 #define   UOS_MOVE   (1<<4)
 #define   START_DMA  (1<<0)
 #define DMA_GUC_WOPCM_OFFSET   _MMIO(0xc340)
+#define   HUC_LOADING_AGENT_VCR  (0<<1)
+#define   HUC_LOADING_AGENT_GUC  (1<<1)
 #define   GUC_WOPCM_OFFSET_VALUE 0x8   /* 512KB */
 #define GUC_MAX_IDLE_COUNT _MMIO(0xC3E4)
 
diff --git a/drivers/gpu/drm/i915/intel_guc_loader.c b/drivers/gpu/drm/i915/intel_guc_loader.c
index 261ae5b..ac0d9e7 100644
--- 

[Intel-gfx] [PATCH v2 6/6] drm/i915/huc: Support HuC authentication

2016-02-08 Thread yu . dai
From: Alex Dai 

HuC authentication is done via a host2guc call. The HuC RSA key data
is sent to the GuC for authentication.

Signed-off-by: Alex Dai 
Signed-off-by: Peter Antoine 
---
 drivers/gpu/drm/i915/i915_guc_submission.c | 65 ++
 drivers/gpu/drm/i915/intel_guc_fwif.h  |  1 +
 drivers/gpu/drm/i915/intel_guc_loader.c|  2 +
 3 files changed, 68 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index d7543ef..01ab55c 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -25,6 +25,7 @@
 #include 
 #include "i915_drv.h"
 #include "intel_guc.h"
+#include "intel_huc.h"
 
 /**
  * DOC: GuC-based command submission
@@ -1027,3 +1028,67 @@ int intel_guc_resume(struct drm_device *dev)
 
return host2guc_action(guc, data, ARRAY_SIZE(data));
 }
+
+/**
+ * intel_huc_ucode_auth() - authenticate ucode
+ * @dev: the drm device
+ *
+ * Triggers a HuC fw authentication request to the GuC via host-2-guc
+ * interface.
+ */
+void intel_huc_ucode_auth(struct drm_device *dev)
+{
+   struct drm_i915_private *dev_priv = dev->dev_private;
+   struct intel_guc *guc = &dev_priv->guc;
+   struct intel_huc *huc = &dev_priv->huc;
+   int ret;
+   u32 data[2];
+
+   /* Bypass the case where there is no HuC firmware */
+   if (huc->huc_fw.fetch_status == UC_FIRMWARE_NONE ||
+   huc->huc_fw.load_status == UC_FIRMWARE_NONE)
+   return;
+
+   if (guc->guc_fw.load_status != UC_FIRMWARE_SUCCESS) {
+   DRM_ERROR("HuC: GuC fw wasn't loaded. Can't authenticate");
+   return;
+   }
+
+   if (huc->huc_fw.load_status != UC_FIRMWARE_SUCCESS) {
+   DRM_ERROR("HuC: fw wasn't loaded. Nothing to authenticate");
+   return;
+   }
+
+   ret = i915_gem_obj_ggtt_pin(huc->huc_fw.uc_fw_obj, 0, 0);
+   if (ret) {
+   DRM_ERROR("HuC: Pin failed");
+   return;
+   }
+
+   /* Invalidate GuC TLB to let GuC take the latest updates to GTT. */
+   I915_WRITE(GEN8_GTCR, GEN8_GTCR_INVALIDATE);
+
+   /* Specify auth action and where public signature is. It's stored
+* at the beginning of the gem object, before the fw bits
+*/
+   data[0] = HOST2GUC_ACTION_AUTHENTICATE_HUC;
+   data[1] = i915_gem_obj_ggtt_offset(huc->huc_fw.uc_fw_obj) +
+   huc->huc_fw.rsa_offset;
+
+   ret = host2guc_action(guc, data, ARRAY_SIZE(data));
+   if (ret) {
+   DRM_ERROR("HuC: GuC did not ack Auth request\n");
+   goto out;
+   }
+
+   /* Check authentication status, it should be done by now */
+   ret = wait_for_atomic(
+   (I915_READ(HUC_STATUS2) & HUC_FW_VERIFIED) > 0, 5000);
+   if (ret) {
+   DRM_ERROR("HuC: Authentication failed\n");
+   goto out;
+   }
+
+out:
+   i915_gem_object_ggtt_unpin(huc->huc_fw.uc_fw_obj);
+}
diff --git a/drivers/gpu/drm/i915/intel_guc_fwif.h b/drivers/gpu/drm/i915/intel_guc_fwif.h
index a7bedc8..28280dd 100644
--- a/drivers/gpu/drm/i915/intel_guc_fwif.h
+++ b/drivers/gpu/drm/i915/intel_guc_fwif.h
@@ -436,6 +436,7 @@ enum host2guc_action {
HOST2GUC_ACTION_ENTER_S_STATE = 0x501,
HOST2GUC_ACTION_EXIT_S_STATE = 0x502,
HOST2GUC_ACTION_SLPC_REQUEST = 0x3003,
+   HOST2GUC_ACTION_AUTHENTICATE_HUC = 0x4000,
HOST2GUC_ACTION_LIMIT
 };
 
diff --git a/drivers/gpu/drm/i915/intel_guc_loader.c b/drivers/gpu/drm/i915/intel_guc_loader.c
index ac0d9e7..b1d8ad1 100644
--- a/drivers/gpu/drm/i915/intel_guc_loader.c
+++ b/drivers/gpu/drm/i915/intel_guc_loader.c
@@ -424,6 +424,8 @@ int intel_guc_ucode_load(struct drm_device *dev)
intel_uc_fw_status_repr(guc_fw->fetch_status),
intel_uc_fw_status_repr(guc_fw->load_status));
 
+   intel_huc_ucode_auth(dev);
+
if (i915.enable_guc_submission) {
/* The execbuf_client will be recreated. Release it first. */
i915_guc_submission_disable(dev);
-- 
2.5.0



[Intel-gfx] [PATCH v2] drm/i915/guc: Decouple GuC engine id from ring id

2016-01-23 Thread yu . dai
From: Alex Dai 

Previously the GuC used ring id as engine id, because the two had the
same definition. That is no longer true since this commit:

commit de1add360522c876c25ef2ab1c94bdb509ab
Author: Tvrtko Ursulin 
Date:   Fri Jan 15 15:12:50 2016 +

drm/i915: Decouple execbuf uAPI from internal implementation

Add a GuC engine id to the GuC interface to decouple it from the ring
id used by the driver.

v2: Keep the ring-name printout in debugfs; use for_each_ring() where
possible to keep the driver consistent. (Chris W.)

Signed-off-by: Alex Dai 
---
 drivers/gpu/drm/i915/i915_debugfs.c| 12 +-
 drivers/gpu/drm/i915/i915_guc_submission.c | 38 ++
 drivers/gpu/drm/i915/intel_guc.h   |  6 ++---
 drivers/gpu/drm/i915/intel_guc_fwif.h  | 17 +
 drivers/gpu/drm/i915/intel_lrc.c   |  5 
 drivers/gpu/drm/i915/intel_ringbuffer.h|  1 +
 6 files changed, 45 insertions(+), 34 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index c5db235..cea1844 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2463,9 +2463,9 @@ static void i915_guc_client_info(struct seq_file *m,
 
for_each_ring(ring, dev_priv, i) {
seq_printf(m, "\tSubmissions: %llu %s\n",
-   client->submissions[i],
+   client->submissions[ring->guc_id],
ring->name);
-   tot += client->submissions[i];
+   tot += client->submissions[ring->guc_id];
}
seq_printf(m, "\tTotal: %llu\n", tot);
 }
@@ -2502,10 +2502,10 @@ static int i915_guc_info(struct seq_file *m, void *data)
 
seq_printf(m, "\nGuC submissions:\n");
for_each_ring(ring, dev_priv, i) {
-   seq_printf(m, "\t%-24s: %10llu, last seqno 0x%08x %9d\n",
-   ring->name, guc.submissions[i],
-   guc.last_seqno[i], guc.last_seqno[i]);
-   total += guc.submissions[i];
+   seq_printf(m, "\t%-24s: %10llu, last seqno 0x%08x\n",
+   ring->name, guc.submissions[ring->guc_id],
+   guc.last_seqno[ring->guc_id]);
+   total += guc.submissions[ring->guc_id];
}
seq_printf(m, "\t%s: %llu\n", "Total", total);
 
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index 51ae5c1..b4da20f 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -376,6 +376,8 @@ static void guc_init_proc_desc(struct intel_guc *guc,
 static void guc_init_ctx_desc(struct intel_guc *guc,
  struct i915_guc_client *client)
 {
+   struct drm_i915_private *dev_priv = guc_to_i915(guc);
+   struct intel_engine_cs *ring;
struct intel_context *ctx = client->owner;
struct guc_context_desc desc;
struct sg_table *sg;
@@ -388,10 +390,8 @@ static void guc_init_ctx_desc(struct intel_guc *guc,
desc.priority = client->priority;
desc.db_id = client->doorbell_id;
 
-   for (i = 0; i < I915_NUM_RINGS; i++) {
-   struct guc_execlist_context *lrc = &desc.lrc[i];
-   struct intel_ringbuffer *ringbuf = ctx->engine[i].ringbuf;
-   struct intel_engine_cs *ring;
+   for_each_ring(ring, dev_priv, i) {
+   struct guc_execlist_context *lrc = &desc.engines[ring->guc_id];
struct drm_i915_gem_object *obj;
uint64_t ctx_desc;
 
@@ -406,7 +406,6 @@ static void guc_init_ctx_desc(struct intel_guc *guc,
if (!obj)
break;  /* XXX: continue? */
 
-   ring = ringbuf->ring;
ctx_desc = intel_lr_context_descriptor(ctx, ring);
lrc->context_desc = (u32)ctx_desc;
 
@@ -414,16 +413,16 @@ static void guc_init_ctx_desc(struct intel_guc *guc,
lrc->ring_lcra = i915_gem_obj_ggtt_offset(obj) +
LRC_STATE_PN * PAGE_SIZE;
lrc->context_id = (client->ctx_index << GUC_ELC_CTXID_OFFSET) |
-   (ring->id << GUC_ELC_ENGINE_OFFSET);
+   (ring->guc_id << GUC_ELC_ENGINE_OFFSET);
 
-   obj = ringbuf->obj;
+   obj = ctx->engine[i].ringbuf->obj;
 
lrc->ring_begin = i915_gem_obj_ggtt_offset(obj);
lrc->ring_end = lrc->ring_begin + obj->base.size - 1;
lrc->ring_next_free_location = lrc->ring_begin;
lrc->ring_current_tail_pointer_value = 0;
 
-   desc.engines_used |= (1 << ring->id);
+   desc.engines_used |= (1 << ring->guc_id);
}
 
WARN_ON(desc.engines_used == 0);
@@ -510,7 +509,6 @@ int i915_guc_wq_check_space(struct i915_guc_client 

Re: [Intel-gfx] [PATCH] drm/i915/guc: Decouple GuC engine id from ring id

2016-01-23 Thread Yu Dai



On 01/23/2016 10:25 AM, Chris Wilson wrote:

On Fri, Jan 22, 2016 at 03:06:28PM -0800, yu@intel.com wrote:
> From: Alex Dai 
>
> Previously, the GuC used the ring id as the engine id because they had
> the same definition. But this is no longer true since this commit:
>
> commit de1add360522c876c25ef2ab1c94bdb509ab
> Author: Tvrtko Ursulin 
> Date:   Fri Jan 15 15:12:50 2016 +
>
> drm/i915: Decouple execbuf uAPI from internal implementation
>
> Added GuC engine id into GuC interface to decouple it from ring id used
> by driver.
>
> Signed-off-by: Alex Dai 
>
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index c5db235..9a4e01e 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -2446,7 +2446,6 @@ static void i915_guc_client_info(struct seq_file *m,
> struct drm_i915_private *dev_priv,
> struct i915_guc_client *client)
>  {
> -  struct intel_engine_cs *ring;
>uint64_t tot = 0;
>uint32_t i;
>
> @@ -2461,10 +2460,9 @@ static void i915_guc_client_info(struct seq_file *m,
>seq_printf(m, "\tFailed doorbell: %u\n", client->b_fail);
>seq_printf(m, "\tLast submission result: %d\n", client->retcode);
>
> -  for_each_ring(ring, dev_priv, i) {
> -  seq_printf(m, "\tSubmissions: %llu %s\n",
> -  client->submissions[i],
> -  ring->name);
> +  for (i = GUC_RENDER_ENGINE; i < GUC_MAX_ENGINES_NUM; i++) {
> +  seq_printf(m, "\tSubmissions: %llu, engine %d\n",
> +  client->submissions[i], i);
>tot += client->submissions[i];
>}
>seq_printf(m, "\tTotal: %llu\n", tot);
> @@ -2477,7 +2475,6 @@ static int i915_guc_info(struct seq_file *m, void *data)
>struct drm_i915_private *dev_priv = dev->dev_private;
>struct intel_guc guc;
>struct i915_guc_client client = {};
> -  struct intel_engine_cs *ring;
>enum intel_ring_id i;
>u64 total = 0;
>
> @@ -2501,9 +2498,9 @@ static int i915_guc_info(struct seq_file *m, void *data)
>seq_printf(m, "GuC last action error code: %d\n", guc.action_err);
>
>seq_printf(m, "\nGuC submissions:\n");
> -  for_each_ring(ring, dev_priv, i) {
> -  seq_printf(m, "\t%-24s: %10llu, last seqno 0x%08x %9d\n",
> -  ring->name, guc.submissions[i],
> +  for (i = GUC_RENDER_ENGINE; i < GUC_MAX_ENGINES_NUM; i++) {
> +  seq_printf(m, "\tengine %d: %10llu, last seqno 0x%08x %9d\n",
> +  i, guc.submissions[i],
>guc.last_seqno[i], guc.last_seqno[i]);
>total += guc.submissions[i];

For debugfs, would it not be more convenient to use the ring->name and
show the corresponding guc engine id?
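Chris's suggestion amounts to keeping two index spaces side by side: iterate engines in the driver's own order, but index GuC-owned arrays by the firmware's engine id. A minimal, self-contained sketch of that pattern (hypothetical types and names, not the real i915 structures):

```c
#include <stddef.h>

/* Hypothetical stand-ins for the two id spaces -- illustrative only. */
enum drv_id  { RCS_ = 0, VCS_ = 1, BCS_ = 2 };
enum guc_id_ { G_RENDER = 0, G_VIDEO = 1, G_BLIT = 2, N_GUC = 3 };

struct engine {
    const char *name;
    enum drv_id id;       /* driver's index */
    enum guc_id_ guc_id;  /* firmware's index -- may differ */
};

/* Sum per-engine counters stored in GuC-id order while iterating
 * engines in driver order (the for_each_ring idea). */
unsigned long total_submissions(const struct engine *engines, size_t n,
                                const unsigned long *submissions /* [N_GUC] */)
{
    unsigned long tot = 0;
    for (size_t i = 0; i < n; i++)
        tot += submissions[engines[i].guc_id];
    return tot;
}
```

The debugfs loop can then print `engines[i].name` next to `submissions[engines[i].guc_id]`, giving both the friendly name and the firmware-side counter.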

> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
> index 51ae5c1..601e2c8 100644
> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
> @@ -365,6 +365,14 @@ static void guc_init_proc_desc(struct intel_guc *guc,
>kunmap_atomic(base);
>  }
>
> +static const enum intel_ring_id guc_engine_map[GUC_MAX_ENGINES_NUM] = {
> +  [GUC_RENDER_ENGINE] = RCS,
> +  [GUC_VIDEO_ENGINE] = VCS,
> +  [GUC_BLITTER_ENGINE] = BCS,
> +  [GUC_VIDEOENHANCE_ENGINE] = VECS,
> +  [GUC_VIDEO_ENGINE2] = VCS2
> +};
> +
>  /*
>   * Initialise/clear the context descriptor shared with the GuC firmware.
>   *
> @@ -388,9 +396,10 @@ static void guc_init_ctx_desc(struct intel_guc *guc,
>desc.priority = client->priority;
>desc.db_id = client->doorbell_id;
>
> -  for (i = 0; i < I915_NUM_RINGS; i++) {
> +  for (i = GUC_RENDER_ENGINE; i < GUC_MAX_ENGINES_NUM; i++) {
>struct guc_execlist_context *lrc = &desc.engines[i];

Again, would it not be more consistent to iterate over engines
(for_each_ring) and use struct guc_execlist_context *lrc =
&desc.engines[ring->guc_id]?

> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
> index b12f2aa..b69eadb 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
> @@ -158,6 +158,7 @@ struct  intel_engine_cs {
>  #define I915_NUM_RINGS 5
>  #define _VCS(n) (VCS + (n))
>unsigned int exec_id;
> +  unsigned int guc_engine_id;

Is the tautology useful? It is an engine, so intel_engine_cs->guc_id
would mean the correspondance map between our engines and the guc's.



Yes, guc_id is a good name. And I agree with you: using for_each_ring 
where possible will be more consistent with other driver code. Thanks for 
your review and comments.


Alex
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH] drm/i915/guc: Decouple GuC engine id from ring id

2016-01-22 Thread yu . dai
From: Alex Dai 

Previously, the GuC used the ring id as the engine id because they had
the same definition. But this is no longer true since this commit:

commit de1add360522c876c25ef2ab1c94bdb509ab
Author: Tvrtko Ursulin 
Date:   Fri Jan 15 15:12:50 2016 +

drm/i915: Decouple execbuf uAPI from internal implementation

Added GuC engine id into GuC interface to decouple it from ring id used
by driver.

Signed-off-by: Alex Dai 

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index c5db235..9a4e01e 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2446,7 +2446,6 @@ static void i915_guc_client_info(struct seq_file *m,
 struct drm_i915_private *dev_priv,
 struct i915_guc_client *client)
 {
-   struct intel_engine_cs *ring;
uint64_t tot = 0;
uint32_t i;
 
@@ -2461,10 +2460,9 @@ static void i915_guc_client_info(struct seq_file *m,
seq_printf(m, "\tFailed doorbell: %u\n", client->b_fail);
seq_printf(m, "\tLast submission result: %d\n", client->retcode);
 
-   for_each_ring(ring, dev_priv, i) {
-   seq_printf(m, "\tSubmissions: %llu %s\n",
-   client->submissions[i],
-   ring->name);
+   for (i = GUC_RENDER_ENGINE; i < GUC_MAX_ENGINES_NUM; i++) {
+   seq_printf(m, "\tSubmissions: %llu, engine %d\n",
+   client->submissions[i], i);
tot += client->submissions[i];
}
seq_printf(m, "\tTotal: %llu\n", tot);
@@ -2477,7 +2475,6 @@ static int i915_guc_info(struct seq_file *m, void *data)
struct drm_i915_private *dev_priv = dev->dev_private;
struct intel_guc guc;
struct i915_guc_client client = {};
-   struct intel_engine_cs *ring;
enum intel_ring_id i;
u64 total = 0;
 
@@ -2501,9 +2498,9 @@ static int i915_guc_info(struct seq_file *m, void *data)
seq_printf(m, "GuC last action error code: %d\n", guc.action_err);
 
seq_printf(m, "\nGuC submissions:\n");
-   for_each_ring(ring, dev_priv, i) {
-   seq_printf(m, "\t%-24s: %10llu, last seqno 0x%08x %9d\n",
-   ring->name, guc.submissions[i],
+   for (i = GUC_RENDER_ENGINE; i < GUC_MAX_ENGINES_NUM; i++) {
+   seq_printf(m, "\tengine %d: %10llu, last seqno 0x%08x %9d\n",
+   i, guc.submissions[i],
guc.last_seqno[i], guc.last_seqno[i]);
total += guc.submissions[i];
}
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index 51ae5c1..601e2c8 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -365,6 +365,14 @@ static void guc_init_proc_desc(struct intel_guc *guc,
kunmap_atomic(base);
 }
 
+static const enum intel_ring_id guc_engine_map[GUC_MAX_ENGINES_NUM] = {
+   [GUC_RENDER_ENGINE] = RCS,
+   [GUC_VIDEO_ENGINE] = VCS,
+   [GUC_BLITTER_ENGINE] = BCS,
+   [GUC_VIDEOENHANCE_ENGINE] = VECS,
+   [GUC_VIDEO_ENGINE2] = VCS2
+};
+
 /*
  * Initialise/clear the context descriptor shared with the GuC firmware.
  *
@@ -388,9 +396,10 @@ static void guc_init_ctx_desc(struct intel_guc *guc,
desc.priority = client->priority;
desc.db_id = client->doorbell_id;
 
-   for (i = 0; i < I915_NUM_RINGS; i++) {
+   for (i = GUC_RENDER_ENGINE; i < GUC_MAX_ENGINES_NUM; i++) {
struct guc_execlist_context *lrc = &desc.engines[i];
-   struct intel_ringbuffer *ringbuf = ctx->engine[i].ringbuf;
+   enum intel_ring_id ring_id = guc_engine_map[i];
+   struct intel_ringbuffer *ringbuf = ctx->engine[ring_id].ringbuf;
struct intel_engine_cs *ring;
struct drm_i915_gem_object *obj;
uint64_t ctx_desc;
@@ -402,7 +411,7 @@ static void guc_init_ctx_desc(struct intel_guc *guc,
 * for now who owns a GuC client. But for future owner of GuC
 * client, need to make sure lrc is pinned prior to enter here.
 */
-   obj = ctx->engine[i].state;
+   obj = ctx->engine[ring_id].state;
if (!obj)
break;  /* XXX: continue? */
 
@@ -414,7 +423,7 @@ static void guc_init_ctx_desc(struct intel_guc *guc,
lrc->ring_lcra = i915_gem_obj_ggtt_offset(obj) +
LRC_STATE_PN * PAGE_SIZE;
lrc->context_id = (client->ctx_index << GUC_ELC_CTXID_OFFSET) |
-   (ring->id << GUC_ELC_ENGINE_OFFSET);
+   (i << GUC_ELC_ENGINE_OFFSET);
 
obj = ringbuf->obj;
 
@@ -423,7 +432,7 @@ static void guc_init_ctx_desc(struct intel_guc *guc,
 

Re: [Intel-gfx] [PATCH 1/2] Revert "FROM_UPSTREAM [VPG]: drm/i915/kbl: drm/i915: Avoid GuC loading for now on Kabylake."

2016-01-19 Thread Yu Dai



On 01/19/2016 01:25 PM, Daniel Vetter wrote:

On Tue, Jan 19, 2016 at 09:18:50PM +, Peter Antoine wrote:
> This reverts commit a92d3f32eafc57cca55e654ecfd916f283100365.

Shouldn't this be patch 2/2? Enabling GuC loading before it's fixed isn't
awesome.

Either way needs a proper commit message (why is it ok to enable now?)


This request was from Sean Kelley, who is working on some media features 
for KBL that rely on HuC firmware. HuC loading is done by the GuC.


Thanks,
Alex


> ---
>  drivers/gpu/drm/i915/i915_drv.h | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index af30148..f99a988 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2635,8 +2635,8 @@ struct drm_i915_cmd_table {
>
>  #define HAS_CSR(dev)  (IS_GEN9(dev))
>
> -#define HAS_GUC_UCODE(dev)(IS_GEN9(dev) && !IS_KABYLAKE(dev))
> -#define HAS_GUC_SCHED(dev)(IS_GEN9(dev) && !IS_KABYLAKE(dev))
> +#define HAS_GUC_UCODE(dev)(IS_GEN9(dev))
> +#define HAS_GUC_SCHED(dev)(IS_GEN9(dev))
>
>  #define HAS_RESOURCE_STREAMER(dev) (IS_HASWELL(dev) || \
>INTEL_INFO(dev)->gen >= 8)
> --
> 1.9.1
>





Re: [Intel-gfx] [PATCH] drm/i915/gen9: Correct max save/restore register count during gpu reset with GuC

2016-01-19 Thread Yu Dai

Thanks for catching the typo. LGTM.

Reviewed-by: Alex Dai 

On 01/18/2016 07:59 AM, Arun Siluvery wrote:

In GuC submission mode, the driver has to provide a list of registers to be
saved/restored during GPU reset; make the max number of registers consistent
with the value defined in the firmware. If they are not in sync, register
save/restore during GPU reset won't work as expected.

Cc: Alex Dai 
Cc: Dave Gordon 
Signed-off-by: Arun Siluvery 
---
  drivers/gpu/drm/i915/intel_guc_fwif.h | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/intel_guc_fwif.h b/drivers/gpu/drm/i915/intel_guc_fwif.h
index 130d94c..1d8048b 100644
--- a/drivers/gpu/drm/i915/intel_guc_fwif.h
+++ b/drivers/gpu/drm/i915/intel_guc_fwif.h
@@ -370,7 +370,7 @@ struct guc_policies {
  #define GUC_REGSET_SAVE_DEFAULT_VALUE 0x8
  #define GUC_REGSET_SAVE_CURRENT_VALUE 0x10
  
-#define GUC_REGSET_MAX_REGISTERS	20
+#define GUC_REGSET_MAX_REGISTERS   25
  #define GUC_MMIO_WHITE_LIST_START 0x24d0
  #define GUC_MMIO_WHITE_LIST_MAX   12
  #define GUC_S3_SAVE_SPACE_PAGES   10




Re: [Intel-gfx] [PATCH] i915/guc: Add Kabylake GuC Loading

2016-01-19 Thread Yu Dai
I am OK with the change here. However, in i915_drv.h, please check the 
definitions of HAS_GUC_UCODE() and HAS_GUC_SCHED(). I believe they are 
disabled for KBL.


Thanks,
Alex

On 01/18/2016 06:41 AM, Peter Antoine wrote:

This patch adds the loading of the GuC for Kabylake.
It loads a v2.4 firmware.

Signed-off-by: Peter Antoine 
Signed-off-by: Michel Thierry 
---
  drivers/gpu/drm/i915/intel_guc_loader.c | 6 ++
  1 file changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/i915/intel_guc_loader.c b/drivers/gpu/drm/i915/intel_guc_loader.c
index 3accd91..bbfa8f3 100644
--- a/drivers/gpu/drm/i915/intel_guc_loader.c
+++ b/drivers/gpu/drm/i915/intel_guc_loader.c
@@ -61,6 +61,8 @@
  
  #define I915_SKL_GUC_UCODE "i915/skl_guc_ver4.bin"

  MODULE_FIRMWARE(I915_SKL_GUC_UCODE);
+#define I915_KBL_GUC_UCODE "i915/kbl_guc_ver2.bin"
+MODULE_FIRMWARE(I915_KBL_GUC_UCODE);
  
  /* User-friendly representation of an enum */

  const char *intel_guc_fw_status_repr(enum intel_guc_fw_status status)
@@ -597,6 +599,10 @@ void intel_guc_ucode_init(struct drm_device *dev)
fw_path = I915_SKL_GUC_UCODE;
guc_fw->guc_fw_major_wanted = 4;
guc_fw->guc_fw_minor_wanted = 3;
+   } else if (IS_KABYLAKE(dev)) {
+   fw_path = I915_KBL_GUC_UCODE;
+   guc_fw->guc_fw_major_wanted = 2;
+   guc_fw->guc_fw_minor_wanted = 4;
} else {
i915.enable_guc_submission = false;
fw_path = ""; /* unknown device */




Re: [Intel-gfx] [PATCH 2/3] drm/i915: resize the GuC WOPCM for rc6

2016-01-19 Thread Yu Dai



On 01/08/2016 07:03 AM, Peter Antoine wrote:

This patch resizes the GuC WOPCM so that the GuC and the RC6 memory
spaces do not overlap.

Issue: https://jira01.devtools.intel.com/browse/VIZ-6638
Signed-off-by: Peter Antoine 
---
  drivers/gpu/drm/i915/i915_guc_reg.h | 3 ++-
  drivers/gpu/drm/i915/intel_guc_loader.c | 5 +
  2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_reg.h b/drivers/gpu/drm/i915/i915_guc_reg.h
index 685c799..cb938b0 100644
--- a/drivers/gpu/drm/i915/i915_guc_reg.h
+++ b/drivers/gpu/drm/i915/i915_guc_reg.h
@@ -58,7 +58,8 @@
  #define GUC_MAX_IDLE_COUNT_MMIO(0xC3E4)
  
  #define GUC_WOPCM_SIZE			_MMIO(0xc050)

-#define   GUC_WOPCM_SIZE_VALUE   (0x80 << 12)/* 512KB */
+#define   GUC_WOPCM_SIZE_VALUE (0x80 << 12)  /* 512KB */
+#define   BXT_GUC_WOPCM_SIZE_VALUE (0x70 << 12)  /* 448KB */
  
  /* GuC addresses below GUC_WOPCM_TOP don't map through the GTT */

  #define   GUC_WOPCM_TOP   (GUC_WOPCM_SIZE_VALUE)
diff --git a/drivers/gpu/drm/i915/intel_guc_loader.c b/drivers/gpu/drm/i915/intel_guc_loader.c
index 8182d11..6b17d44 100644
--- a/drivers/gpu/drm/i915/intel_guc_loader.c
+++ b/drivers/gpu/drm/i915/intel_guc_loader.c
@@ -305,6 +305,11 @@ static int guc_ucode_xfer(struct drm_i915_private *dev_priv)
  
  	/* init WOPCM */

I915_WRITE(GUC_WOPCM_SIZE, GUC_WOPCM_SIZE_VALUE);


Just found a problem here. The line above needs to be deleted. 
GUC_WOPCM_SIZE is a write-once register; it will be locked after the first 
write.
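Alex's point is that a write-once register latches its first value and silently drops later writes, so the generic write above would prevent the Broxton-specific value from ever landing. A toy model of that behaviour (illustrative only, not real hardware access):

```c
#include <stdint.h>
#include <stdbool.h>

/* Toy model of a write-once (locked-after-first-write) register such
 * as GUC_WOPCM_SIZE. */
struct wo_reg {
    uint32_t value;
    bool locked;
};

/* Returns true if the write took effect. */
bool wo_reg_write(struct wo_reg *r, uint32_t v)
{
    if (r->locked)
        return false;   /* silently dropped by the real hardware */
    r->value = v;
    r->locked = true;
    return true;
}
```

With this model, writing the default 512KB value first and the 448KB Broxton value second leaves the register at 512KB, which is exactly the bug being pointed out.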


Thanks,
Alex

+   if (IS_BROXTON(dev))
+   I915_WRITE(GUC_WOPCM_SIZE, BXT_GUC_WOPCM_SIZE_VALUE);
+   else
+   I915_WRITE(GUC_WOPCM_SIZE, GUC_WOPCM_SIZE_VALUE);
+
I915_WRITE(DMA_GUC_WOPCM_OFFSET, GUC_WOPCM_OFFSET_VALUE);
  
  	/* Enable MIA caching. GuC clock gating is disabled. */




Re: [Intel-gfx] [PATCH v1] drm/i915/guc: Fix a memory leak where guc->execbuf_client is not freed

2016-01-13 Thread Yu Dai



On 01/13/2016 10:15 AM, Dave Gordon wrote:

On 12/01/16 23:17, yu@intel.com wrote:
> From: Alex Dai 
>
> During driver unloading, the guc_client created for command submission
> needs to be released to avoid memory leak.
>
> The struct_mutex needs to be held before tearing down GuC.
>
> v1: Move i915_guc_submission_disable out of i915_guc_submission_fini and
>  take struct_mutex lock before release GuC client. (Dave Gordon)

You don't seem to have implemented all the points I mentioned? I think
you want:

drivers/gpu/drm/i915/intel_guc_loader.c:
@@ -445,6 +445,7 @@ int intel_guc_ucode_load(struct drm_device *dev)

  direct_interrupts_to_host(dev_priv);
  i915_guc_submission_disable(dev);
+   i915_guc_submission_fini(dev);

Optional, but cleaner. We called i915_guc_submission_init() earlier in
this function, so we should call i915_guc_submission_fini() in the
failure path. That way, we either succeed, or leave the system state
unchanged, NOT leaving extra objects allocated.

  return err;
   }


I don't want this, because struct_mutex is already held by the caller, 
while fini() will try to acquire it too.

@@ -561,10 +562,12 @@ static void guc_fw_fetch(struct drm_device *dev, struct intel_guc_fw *guc_fw)
DRM_ERROR("Failed to fetch GuC firmware from %s (error %d)\n",
  guc_fw->guc_fw_path, err);

+   mutex_lock(&dev->struct_mutex);
	obj = guc_fw->guc_fw_obj;
	if (obj)
		drm_gem_object_unreference(&obj->base);
	guc_fw->guc_fw_obj = NULL;
+   mutex_unlock(&dev->struct_mutex);

This is the locking that needs to be added to the failure path.
This is required *in addition to* the locking reorganisation below.


I missed this part.

> Signed-off-by: Alex Dai 
>
> diff --git a/drivers/gpu/drm/i915/intel_guc_loader.c 
b/drivers/gpu/drm/i915/intel_guc_loader.c
> index d20788f..70fa8f5 100644
> --- a/drivers/gpu/drm/i915/intel_guc_loader.c
> +++ b/drivers/gpu/drm/i915/intel_guc_loader.c
> @@ -631,10 +631,11 @@ void intel_guc_ucode_fini(struct drm_device *dev)
>struct drm_i915_private *dev_priv = dev->dev_private;
>struct intel_guc_fw *guc_fw = &dev_priv->guc.guc_fw;
>
> +  mutex_lock(&dev->struct_mutex);
>direct_interrupts_to_host(dev_priv);
> +  i915_guc_submission_disable(dev);
>i915_guc_submission_fini(dev);
>
> -  mutex_lock(&dev->struct_mutex);
>if (guc_fw->guc_fw_obj)
>drm_gem_object_unreference(&guc_fw->guc_fw_obj->base);
>guc_fw->guc_fw_obj = NULL;

This bit is fine, but incomplete without the other changes above.

.Dave.




Re: [Intel-gfx] [PATCH v10] drm/i915: Extend LRC pinning to cover GPU context writeback

2016-01-13 Thread Yu Dai
This version resolved the issue (kernel bug check in 
intel_lr_context_clean_ring) I reported on previous versions. Verified 
by igt drv_module_reload_basic, gem_close_race and -t basic tests.


Reviewed-by: Alex Dai 

On 01/13/2016 08:19 AM, Nick Hoath wrote:

Use the first retired request on a new context to unpin
the old context. This ensures that the hw context remains
bound until it has been written back to by the GPU.
Now that the context is pinned until later in the request/context
lifecycle, it no longer needs to be pinned from context_queue to
retire_requests.
This fixes an issue with GuC submission where the GPU might not
have finished writing back the context before it is unpinned. This
results in a GPU hang.

v2: Moved the new pin to cover GuC submission (Alex Dai)
 Moved the new unpin to request_retire to fix coverage leak
v3: Added switch to default context if freeing a still pinned
 context just in case the hw was actually still using it
v4: Unwrapped context unpin to allow calling without a request
v5: Only create a switch to idle context if the ring doesn't
 already have a request pending on it (Alex Dai)
 Rename unsaved to dirty to avoid double negatives (Dave Gordon)
 Changed _no_req postfix to __ prefix for consistency (Dave Gordon)
 Split out per engine cleanup from context_free as it
 was getting unwieldy
 Corrected locking (Dave Gordon)
v6: Removed some bikeshedding (Mika Kuoppala)
 Added explanation of the GuC hang that this fixes (Daniel Vetter)
v7: Removed extra per request pinning from ring reset code (Alex Dai)
 Added forced ring unpin/clean in error case in context free (Alex Dai)
v8: Renamed lrc specific last_context to lrc_last_context as there
 were some reset cases where the codepaths leaked (Mika Kuoppala)
 NULL'd last_context in reset case - there was a pointer leak
 if someone did reset->close context.
v9: Rebase over "Fix context/engine cleanup order"
v10: Rebase over nightly, remove WARN_ON which caused the
 dependency on dev.

Signed-off-by: Nick Hoath 
Issue: VIZ-4277
Cc: Daniel Vetter 
Cc: David Gordon 
Cc: Chris Wilson 
Cc: Alex Dai 
Cc: Mika Kuoppala 
---
  drivers/gpu/drm/i915/i915_drv.h |   1 +
  drivers/gpu/drm/i915/i915_gem.c |   3 +
  drivers/gpu/drm/i915/intel_lrc.c| 138 ++--
  drivers/gpu/drm/i915/intel_lrc.h|   1 +
  drivers/gpu/drm/i915/intel_ringbuffer.h |   1 +
  5 files changed, 121 insertions(+), 23 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 104bd18..d28e10a 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -882,6 +882,7 @@ struct intel_context {
struct {
struct drm_i915_gem_object *state;
struct intel_ringbuffer *ringbuf;
+   bool dirty;
int pin_count;
} engine[I915_NUM_RINGS];
  
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c

index ddc21d4..7b79405 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1413,6 +1413,9 @@ static void i915_gem_request_retire(struct drm_i915_gem_request *request)
  {
trace_i915_gem_request_retire(request);
  
+	if (i915.enable_execlists)

+   intel_lr_context_complete_check(request);
+
/* We know the GPU must have read the request to have
 * sent us the seqno + interrupt, so use the position
 * of tail of the request to update the last known position
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 5027699..b661058 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -585,9 +585,6 @@ static int execlists_context_queue(struct drm_i915_gem_request *request)
struct drm_i915_gem_request *cursor;
int num_elements = 0;
  
-	if (request->ctx != ring->default_context)

-   intel_lr_context_pin(request);
-
i915_gem_request_reference(request);
  
  	spin_lock_irq(&ring->execlist_lock);

@@ -763,6 +760,13 @@ intel_logical_ring_advance_and_submit(struct drm_i915_gem_request *request)
if (intel_ring_stopped(ring))
return;
  
+	if (request->ctx != ring->default_context) {

+   if (!request->ctx->engine[ring->id].dirty) {
+   intel_lr_context_pin(request);
+   request->ctx->engine[ring->id].dirty = true;
+   }
+   }
+
if (dev_priv->guc.execbuf_client)
i915_guc_submit(dev_priv->guc.execbuf_client, request);
else
@@ -989,12 +993,6 @@ void intel_execlists_retire_requests(struct intel_engine_cs *ring)
	spin_unlock_irq(&ring->execlist_lock);
  
  	

[Intel-gfx] [PATCH v2] drm/i915/guc: Fix a memory leak where guc->execbuf_client is not freed

2016-01-13 Thread yu . dai
From: Alex Dai 

During driver unloading, the guc_client created for command submission
needs to be released to avoid memory leak.

The struct_mutex needs to be held before tearing down GuC.

v1: Move i915_guc_submission_disable out of i915_guc_submission_fini and
take struct_mutex lock before release GuC client. (Dave Gordon)
v2: Add the locking for failure case in guc_fw_fetch. (Dave Gordon)
Add i915_guc_submission_fini for failure case in intel_guc_ucode_load.

Signed-off-by: Alex Dai 

diff --git a/drivers/gpu/drm/i915/intel_guc_loader.c b/drivers/gpu/drm/i915/intel_guc_loader.c
index d20788f..3accd91 100644
--- a/drivers/gpu/drm/i915/intel_guc_loader.c
+++ b/drivers/gpu/drm/i915/intel_guc_loader.c
@@ -445,6 +445,7 @@ fail:
 
direct_interrupts_to_host(dev_priv);
i915_guc_submission_disable(dev);
+   i915_guc_submission_fini(dev);
 
return err;
 }
@@ -561,10 +562,12 @@ fail:
DRM_ERROR("Failed to fetch GuC firmware from %s (error %d)\n",
  guc_fw->guc_fw_path, err);
 
+   mutex_lock(&dev->struct_mutex);
	obj = guc_fw->guc_fw_obj;
	if (obj)
		drm_gem_object_unreference(&obj->base);
	guc_fw->guc_fw_obj = NULL;
+   mutex_unlock(&dev->struct_mutex);
 
release_firmware(fw);   /* OK even if fw is NULL */
guc_fw->guc_fw_fetch_status = GUC_FIRMWARE_FAIL;
@@ -631,10 +634,11 @@ void intel_guc_ucode_fini(struct drm_device *dev)
struct drm_i915_private *dev_priv = dev->dev_private;
	struct intel_guc_fw *guc_fw = &dev_priv->guc.guc_fw;
 
+   mutex_lock(&dev->struct_mutex);
	direct_interrupts_to_host(dev_priv);
+   i915_guc_submission_disable(dev);
	i915_guc_submission_fini(dev);
 
-   mutex_lock(&dev->struct_mutex);
	if (guc_fw->guc_fw_obj)
		drm_gem_object_unreference(&guc_fw->guc_fw_obj->base);
guc_fw->guc_fw_obj = NULL;
-- 
2.5.0



Re: [Intel-gfx] [PATCH] drm/i915/guc: Fix a memory leak where guc->execbuf_client is not freed

2016-01-12 Thread Yu Dai



On 01/12/2016 04:11 AM, Dave Gordon wrote:

On 06/01/16 20:53, yu@intel.com wrote:
> From: Alex Dai 
>
> During driver unloading, the guc_client created for command submission
> needs to be released to avoid memory leak.
>
> Signed-off-by: Alex Dai 
> ---
>   drivers/gpu/drm/i915/i915_guc_submission.c | 3 +++
>   1 file changed, 3 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
> index 9c24424..8ce4f32 100644
> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
> @@ -995,6 +995,9 @@ void i915_guc_submission_fini(struct drm_device *dev)
>struct drm_i915_private *dev_priv = dev->dev_private;
>struct intel_guc *guc = &dev_priv->guc;
>
> +  if (i915.enable_guc_submission)
> +  i915_guc_submission_disable(dev);
> +
>gem_release_guc_obj(dev_priv->guc.ads_obj);
>guc->ads_obj = NULL;

This looks like the right thing to do, but the wrong place to do it.

i915_guc_submission_{init,enable,disable,fini} are the top-level
functions exported from this source file and called (only) from
intel_guc_loader.c

Therefore, the code in intel_guc_ucode_fini() should call
submission_disable() before submission_fini(), like this:

/**
   * intel_guc_ucode_fini() - clean up all allocated resources
   * @dev: drm device
   */
void intel_guc_ucode_fini(struct drm_device *dev)
{
  struct drm_i915_private *dev_priv = dev->dev_private;
  struct intel_guc_fw *guc_fw = &dev_priv->guc.guc_fw;

  direct_interrupts_to_host(dev_priv);
+   i915_guc_submission_disable(dev);
	i915_guc_submission_fini(dev);

  mutex_lock(&dev->struct_mutex);
  if (guc_fw->guc_fw_obj)
  drm_gem_object_unreference(&guc_fw->guc_fw_obj->base);
  guc_fw->guc_fw_obj = NULL;
  mutex_unlock(&dev->struct_mutex);

  guc_fw->guc_fw_fetch_status = GUC_FIRMWARE_NONE;
}

There's no need for it to be conditional, as disable (and fini) are
idempotent; if a thing hasn't been allocated, or has already been
deallocated, then these functions will just do nothing.
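The idempotence Dave relies on comes from NULLing pointers as they are freed, so repeated (or premature) teardown calls are harmless. A tiny self-contained illustration of the pattern (hypothetical names, not the actual i915 functions):

```c
#include <stdlib.h>

/* Toy illustration of idempotent teardown: calling the disable path
 * twice, or without a prior enable, does nothing harmful because the
 * freed pointer is immediately NULLed. */
struct client { int dummy; };

struct guc_state {
    struct client *execbuf_client;
};

void submission_disable(struct guc_state *g)
{
    free(g->execbuf_client);   /* free(NULL) is a defined no-op */
    g->execbuf_client = NULL;  /* so a second call does nothing */
}
```

This is why the call in intel_guc_ucode_fini() does not need to be guarded by i915.enable_guc_submission: if no client was ever allocated, the pointer is NULL and the disable is a no-op.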


I agree. We will keep the symmetry here: 
i915_guc_submission_{init,enable,disable,fini}.

HOWEVER,

while reviewing this I've noticed that the locking is all screwed up;
basically "bf248ca drm/i915: Fix locking around GuC firmware load"
removed locking round the calls into i915_guc_loader.c and added it back
in a few places, but not enough.

It would probably have been better to have left the locking in the
caller, and hence round the entirety of the calls to _init, _load,
_fini, and then explicitly DROP the mutex only for the duration of the
request_firmware call.

It would have been better still not to insist on synchronous firmware
load in the first place; the original generic (and asynchronous) loader
didn't require struct_mutex or any other locking around the
request_firmware() call, so we wouldn't now have to fix it (again).

At present, in intel_guc_loader.c, intel_guc_ucode_load() is called with
the struct_mutex already held by the caller, but _init() and _fini() are
called with it NOT held.

All exported functions in i915_guc_submission.c expect it to be held
when they're called.

On that basis, what we need now is:

guc_fw_fetch() needs to take & release the mutex round the unreference
in the fail: path (like the code in _fini above).


I prefer the current approach, which only takes the lock for the 
necessary critical section.

intel_guc_ucode_fini() needs to extend the scope of the lock to enclose
all calls to _submission_ functions. So the above becomes:

/**
* intel_guc_ucode_fini() - clean up all allocated resources
* @dev: drm device
*/
void intel_guc_ucode_fini(struct drm_device *dev)
{
struct drm_i915_private *dev_priv = dev->dev_private;
struct intel_guc_fw *guc_fw = &dev_priv->guc.guc_fw;

mutex_lock(&dev->struct_mutex);
direct_interrupts_to_host(dev_priv);
i915_guc_submission_disable(dev);
i915_guc_submission_fini(dev);

if (guc_fw->guc_fw_obj)
drm_gem_object_unreference(&guc_fw->guc_fw_obj->base);
guc_fw->guc_fw_obj = NULL;
mutex_unlock(&dev->struct_mutex);

guc_fw->guc_fw_fetch_status = GUC_FIRMWARE_NONE;
}


This is done by patch https://patchwork.freedesktop.org/patch/68708/. 
Please review this one.

FINALLY,

intel_guc_ucode_load() should probably call i915_guc_submission_fini()
in the failure path (after submission_disable()) as it called
submission_init() earlier. Not critical, as it will get called from
ucode_fini() anyway, but it improves symmetry.
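Dave's symmetry point is the standard "undo what you set up" failure-path rule: if init() succeeded but a later step fails, call fini() before returning so the caller sees the state unchanged. A minimal sketch of that contract, under assumed, hypothetical names (not the real i915 entry points):

```c
#include <stdlib.h>

struct ctx { void *buf; };

int submission_init(struct ctx *c)
{
    c->buf = malloc(64);
    return c->buf ? 0 : -1;
}

void submission_fini(struct ctx *c)
{
    free(c->buf);
    c->buf = NULL;
}

/* load() either fully succeeds or leaves *c exactly as it found it:
 * the failure path mirrors the init it performed. */
int load(struct ctx *c, int simulate_failure)
{
    if (submission_init(c) != 0)
        return -1;
    if (simulate_failure) {
        submission_fini(c);   /* failure path undoes the init */
        return -1;
    }
    return 0;
}
```

Either succeed, or leave the system state unchanged, without leaking the objects allocated along the way.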




We don't have a ucode_unload(); ucode_fini() actually does the 
unload job. Because ucode_fini() needs to acquire the lock but 
ucode_load() expects the lock to be held by the caller, calling 
ucode_fini() inside ucode_load() is not good. I don't think it is worth 
wrapping up a ucode_unload() call which only 

[Intel-gfx] [PATCH v1] drm/i915/guc: Fix a memory leak where guc->execbuf_client is not freed

2016-01-12 Thread yu . dai
From: Alex Dai 

During driver unloading, the guc_client created for command submission
needs to be released to avoid memory leak.

The struct_mutex needs to be held before tearing down GuC.

v1: Move i915_guc_submission_disable out of i915_guc_submission_fini and
take struct_mutex lock before release GuC client. (Dave Gordon)

Signed-off-by: Alex Dai 

diff --git a/drivers/gpu/drm/i915/intel_guc_loader.c b/drivers/gpu/drm/i915/intel_guc_loader.c
index d20788f..70fa8f5 100644
--- a/drivers/gpu/drm/i915/intel_guc_loader.c
+++ b/drivers/gpu/drm/i915/intel_guc_loader.c
@@ -631,10 +631,11 @@ void intel_guc_ucode_fini(struct drm_device *dev)
struct drm_i915_private *dev_priv = dev->dev_private;
	struct intel_guc_fw *guc_fw = &dev_priv->guc.guc_fw;
 
+   mutex_lock(&dev->struct_mutex);
	direct_interrupts_to_host(dev_priv);
+   i915_guc_submission_disable(dev);
	i915_guc_submission_fini(dev);
 
-   mutex_lock(&dev->struct_mutex);
	if (guc_fw->guc_fw_obj)
	drm_gem_object_unreference(&guc_fw->guc_fw_obj->base);
guc_fw->guc_fw_obj = NULL;
-- 
2.5.0



[Intel-gfx] [PATCH 1/6] drm/i915/guc: Make the GuC fw loading helper functions general

2016-01-11 Thread yu . dai
From: Alex Dai 

Rename some of the GuC fw loading code to make it more general. We
will utilize it for HuC loading as well.
s/intel_guc_fw/intel_uc_fw/g
s/GUC_FIRMWARE/UC_FIRMWARE/g

Struct intel_guc_fw is renamed to intel_uc_fw. The prefix of its members,
such as 'guc' or 'guc_fw', is either renamed to 'uc' or removed for the
same purpose.

Signed-off-by: Alex Dai 
Signed-off-by: Peter Antoine 
---
 drivers/gpu/drm/i915/i915_debugfs.c |  12 +--
 drivers/gpu/drm/i915/intel_guc.h|  39 +++
 drivers/gpu/drm/i915/intel_guc_loader.c | 181 +---
 3 files changed, 122 insertions(+), 110 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c 
b/drivers/gpu/drm/i915/i915_debugfs.c
index e3377ab..ec667f3 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2410,7 +2410,7 @@ static int i915_guc_load_status_info(struct seq_file *m, 
void *data)
 {
struct drm_info_node *node = m->private;
struct drm_i915_private *dev_priv = node->minor->dev->dev_private;
-   struct intel_guc_fw *guc_fw = &dev_priv->guc.guc_fw;
+   struct intel_uc_fw *guc_fw = &dev_priv->guc.guc_fw;
u32 tmp, i;
 
if (!HAS_GUC_UCODE(dev_priv->dev))
@@ -2418,15 +2418,15 @@ static int i915_guc_load_status_info(struct seq_file 
*m, void *data)
 
seq_printf(m, "GuC firmware status:\n");
seq_printf(m, "\tpath: %s\n",
-   guc_fw->guc_fw_path);
+   guc_fw->uc_fw_path);
seq_printf(m, "\tfetch: %s\n",
-   intel_guc_fw_status_repr(guc_fw->guc_fw_fetch_status));
+   intel_uc_fw_status_repr(guc_fw->fetch_status));
seq_printf(m, "\tload: %s\n",
-   intel_guc_fw_status_repr(guc_fw->guc_fw_load_status));
+   intel_uc_fw_status_repr(guc_fw->load_status));
seq_printf(m, "\tversion wanted: %d.%d\n",
-   guc_fw->guc_fw_major_wanted, guc_fw->guc_fw_minor_wanted);
+   guc_fw->major_ver_wanted, guc_fw->minor_ver_wanted);
seq_printf(m, "\tversion found: %d.%d\n",
-   guc_fw->guc_fw_major_found, guc_fw->guc_fw_minor_found);
+   guc_fw->major_ver_found, guc_fw->minor_ver_found);
seq_printf(m, "\theader: offset is %d; size = %d\n",
guc_fw->header_offset, guc_fw->header_size);
seq_printf(m, "\tuCode: offset is %d; size = %d\n",
diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
index 045b149..2324677 100644
--- a/drivers/gpu/drm/i915/intel_guc.h
+++ b/drivers/gpu/drm/i915/intel_guc.h
@@ -52,29 +52,29 @@ struct i915_guc_client {
int retcode;
 };
 
-enum intel_guc_fw_status {
-   GUC_FIRMWARE_FAIL = -1,
-   GUC_FIRMWARE_NONE = 0,
-   GUC_FIRMWARE_PENDING,
-   GUC_FIRMWARE_SUCCESS
+enum intel_uc_fw_status {
+   UC_FIRMWARE_FAIL = -1,
+   UC_FIRMWARE_NONE = 0,
+   UC_FIRMWARE_PENDING,
+   UC_FIRMWARE_SUCCESS
 };
 
 /*
  * This structure encapsulates all the data needed during the process
  * of fetching, caching, and loading the firmware image into the GuC.
  */
-struct intel_guc_fw {
-   struct drm_device * guc_dev;
-   const char *guc_fw_path;
-   size_t  guc_fw_size;
-   struct drm_i915_gem_object *guc_fw_obj;
-   enum intel_guc_fw_statusguc_fw_fetch_status;
-   enum intel_guc_fw_statusguc_fw_load_status;
-
-   uint16_tguc_fw_major_wanted;
-   uint16_tguc_fw_minor_wanted;
-   uint16_tguc_fw_major_found;
-   uint16_tguc_fw_minor_found;
+struct intel_uc_fw {
+   struct drm_device * uc_dev;
+   const char *uc_fw_path;
+   size_t  uc_fw_size;
+   struct drm_i915_gem_object *uc_fw_obj;
+   enum intel_uc_fw_status fetch_status;
+   enum intel_uc_fw_status load_status;
+
+   uint16_t major_ver_wanted;
+   uint16_t minor_ver_wanted;
+   uint16_t major_ver_found;
+   uint16_t minor_ver_found;
 
uint32_t header_size;
uint32_t header_offset;
@@ -85,7 +85,7 @@ struct intel_guc_fw {
 };
 
 struct intel_guc {
-   struct intel_guc_fw guc_fw;
+   struct intel_uc_fw guc_fw;
uint32_t log_flags;
struct drm_i915_gem_object *log_obj;
 
@@ -114,9 +114,10 @@ struct intel_guc {
 extern void intel_guc_ucode_init(struct drm_device *dev);
 extern int intel_guc_ucode_load(struct drm_device *dev);
 extern void intel_guc_ucode_fini(struct drm_device *dev);
-extern const char *intel_guc_fw_status_repr(enum intel_guc_fw_status status);
+extern const char *intel_uc_fw_status_repr(enum intel_uc_fw_status status);
 extern int intel_guc_suspend(struct drm_device *dev);
 extern int intel_guc_resume(struct drm_device 

[Intel-gfx] [PATCH 4/6] drm/i915/huc: Add HuC fw loading support

2016-01-11 Thread yu . dai
From: Alex Dai 

The HuC loading process is similar to GuC's; intel_uc_fw_fetch() is
used in both cases.

HuC loading needs to happen before GuC loading, and the WOPCM setting
must be done early, before loading either of them.

Signed-off-by: Alex Dai 
Signed-off-by: Peter Antoine 
---
 drivers/gpu/drm/i915/Makefile   |   1 +
 drivers/gpu/drm/i915/i915_dma.c |   3 +
 drivers/gpu/drm/i915/i915_drv.h |   3 +
 drivers/gpu/drm/i915/i915_gem.c |   7 +
 drivers/gpu/drm/i915/i915_guc_reg.h |   3 +
 drivers/gpu/drm/i915/intel_guc_loader.c |   7 +-
 drivers/gpu/drm/i915/intel_huc.h|  44 ++
 drivers/gpu/drm/i915/intel_huc_loader.c | 262 
 8 files changed, 325 insertions(+), 5 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/intel_huc.h
 create mode 100644 drivers/gpu/drm/i915/intel_huc_loader.c

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 0851de07..693cc8f 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -42,6 +42,7 @@ i915-y += i915_cmd_parser.o \
 
 # general-purpose microcontroller (GuC) support
 i915-y += intel_guc_loader.o \
+ intel_huc_loader.o \
  i915_guc_submission.o
 
 # autogenerated null render state
diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index 44a896c..1b99dd3 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -410,6 +410,7 @@ static int i915_load_modeset_init(struct drm_device *dev)
 * working irqs for e.g. gmbus and dp aux transfers. */
intel_modeset_init(dev);
 
+   intel_huc_ucode_init(dev);
intel_guc_ucode_init(dev);
 
ret = i915_gem_init(dev);
@@ -453,6 +454,7 @@ cleanup_gem:
i915_gem_context_fini(dev);
	mutex_unlock(&dev->struct_mutex);
 cleanup_irq:
+   intel_huc_ucode_fini(dev);
intel_guc_ucode_fini(dev);
drm_irq_uninstall(dev);
 cleanup_gem_stolen:
@@ -1194,6 +1196,7 @@ int i915_driver_unload(struct drm_device *dev)
/* Flush any outstanding unpin_work. */
flush_workqueue(dev_priv->wq);
 
+   intel_huc_ucode_fini(dev);
intel_guc_ucode_fini(dev);
	mutex_lock(&dev->struct_mutex);
i915_gem_cleanup_ringbuffer(dev);
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 747d2d8..15e9e59 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -53,6 +53,7 @@
 #include 
 #include 
 #include "intel_guc.h"
+#include "intel_huc.h"
 
 /* General customization:
  */
@@ -1699,6 +1700,7 @@ struct drm_i915_private {
 
struct i915_virtual_gpu vgpu;
 
+   struct intel_huc huc;
struct intel_guc guc;
 
struct intel_csr csr;
@@ -2629,6 +2631,7 @@ struct drm_i915_cmd_table {
 
 #define HAS_GUC_UCODE(dev) (IS_GEN9(dev) && !IS_KABYLAKE(dev))
 #define HAS_GUC_SCHED(dev) (IS_GEN9(dev) && !IS_KABYLAKE(dev))
+#define HAS_HUC_UCODE(dev) (IS_GEN9(dev) && !IS_KABYLAKE(dev))
 
 #define HAS_RESOURCE_STREAMER(dev) (IS_HASWELL(dev) || \
INTEL_INFO(dev)->gen >= 8)
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 6c60e04..75de2eb 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4849,6 +4849,13 @@ i915_gem_init_hw(struct drm_device *dev)
 
/* We can't enable contexts until all firmware is loaded */
if (HAS_GUC_UCODE(dev)) {
+   /* init WOPCM */
+   I915_WRITE(GUC_WOPCM_SIZE, GUC_WOPCM_SIZE_VALUE);
+   I915_WRITE(DMA_GUC_WOPCM_OFFSET, GUC_WOPCM_OFFSET_VALUE |
+   HUC_LOADING_AGENT_GUC);
+
+   intel_huc_ucode_load(dev);
+
ret = intel_guc_ucode_load(dev);
if (ret) {
DRM_ERROR("Failed to initialize GuC, error %d\n", ret);
diff --git a/drivers/gpu/drm/i915/i915_guc_reg.h 
b/drivers/gpu/drm/i915/i915_guc_reg.h
index e4ba582..8d27c09 100644
--- a/drivers/gpu/drm/i915/i915_guc_reg.h
+++ b/drivers/gpu/drm/i915/i915_guc_reg.h
@@ -52,9 +52,12 @@
 #define   DMA_ADDRESS_SPACE_GTT  (8 << 16)
 #define DMA_COPY_SIZE  _MMIO(0xc310)
 #define DMA_CTRL   _MMIO(0xc314)
+#define   HUC_UKERNEL(1<<9)
 #define   UOS_MOVE   (1<<4)
 #define   START_DMA  (1<<0)
 #define DMA_GUC_WOPCM_OFFSET   _MMIO(0xc340)
+#define   HUC_LOADING_AGENT_VCR  (0<<1)
+#define   HUC_LOADING_AGENT_GUC  (1<<1)
 #define   GUC_WOPCM_OFFSET_VALUE 0x8   /* 512KB */
 #define GUC_MAX_IDLE_COUNT _MMIO(0xC3E4)
 
diff --git a/drivers/gpu/drm/i915/intel_guc_loader.c 
b/drivers/gpu/drm/i915/intel_guc_loader.c
index a704d80..5832792 100644
--- 

[Intel-gfx] [PATCH 2/6] drm/i915/guc: Bypass fw loading gracefully if GuC is not supported

2016-01-11 Thread yu . dai
From: Alex Dai 

This is to rework previous patch:

commit 9f9e539f90bcecfdc7b3679d337b7a62d4313205
Author: Daniel Vetter 
Date:   Fri Oct 23 11:10:59 2015 +0200

drm/i915: Shut up GuC errors when it's disabled

There is a case where GuC loading is needed even when GuC submission
is disabled: for example, HuC loading and authentication require the
GuC to be loaded regardless. With this patch, the driver will try to
load the firmware only when a platform explicitly asks for it by
specifying a fw name and version. All other cases are treated as
UC_FIRMWARE_NONE and the loading is bypassed silently.

Signed-off-by: Alex Dai 
---
 drivers/gpu/drm/i915/intel_guc_loader.c | 32 +++-
 1 file changed, 11 insertions(+), 21 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_guc_loader.c 
b/drivers/gpu/drm/i915/intel_guc_loader.c
index 70dbeb5..e11e1e8 100644
--- a/drivers/gpu/drm/i915/intel_guc_loader.c
+++ b/drivers/gpu/drm/i915/intel_guc_loader.c
@@ -594,39 +594,29 @@ void intel_guc_ucode_init(struct drm_device *dev)
 {
struct drm_i915_private *dev_priv = dev->dev_private;
struct intel_uc_fw *guc_fw = &dev_priv->guc.guc_fw;
-   const char *fw_path;
+   const char *fw_path = NULL;
+
+   guc_fw->uc_dev = dev;
+   guc_fw->uc_fw_path = NULL;
+   guc_fw->fetch_status = UC_FIRMWARE_NONE;
+   guc_fw->load_status = UC_FIRMWARE_NONE;
 
if (!HAS_GUC_SCHED(dev))
i915.enable_guc_submission = false;
 
-   if (!HAS_GUC_UCODE(dev)) {
-   fw_path = NULL;
-   } else if (IS_SKYLAKE(dev)) {
+   if (!HAS_GUC_UCODE(dev))
+   return;
+
+   if (IS_SKYLAKE(dev)) {
fw_path = I915_SKL_GUC_UCODE;
guc_fw->major_ver_wanted = 4;
guc_fw->minor_ver_wanted = 3;
-   } else {
-   i915.enable_guc_submission = false;
-   fw_path = "";   /* unknown device */
}
 
-   if (!i915.enable_guc_submission)
-   return;
-
-   guc_fw->uc_dev = dev;
-   guc_fw->uc_fw_path = fw_path;
-   guc_fw->fetch_status = UC_FIRMWARE_NONE;
-   guc_fw->load_status = UC_FIRMWARE_NONE;
-
if (fw_path == NULL)
return;
 
-   if (*fw_path == '\0') {
-   DRM_ERROR("No GuC firmware known for this platform\n");
-   guc_fw->fetch_status = UC_FIRMWARE_FAIL;
-   return;
-   }
-
+   guc_fw->uc_fw_path = fw_path;
guc_fw->fetch_status = UC_FIRMWARE_PENDING;
DRM_DEBUG_DRIVER("GuC firmware pending, path %s\n", fw_path);
intel_uc_fw_fetch(dev, guc_fw);
-- 
2.5.0



[Intel-gfx] [PATCH 3/6] drm/i915/huc: Unified css_header struct for GuC and HuC

2016-01-11 Thread yu . dai
From: Alex Dai 

The HuC firmware css header has almost exactly the same definition as
the GuC firmware's, except for the sw_version. Also, add a new member
fw_type to intel_uc_fw to indicate what kind of fw it is, so that the
loader will pull the right sw_version from the header.

Signed-off-by: Alex Dai 
Signed-off-by: Peter Antoine 
---
 drivers/gpu/drm/i915/intel_guc.h|  4 
 drivers/gpu/drm/i915/intel_guc_fwif.h   | 16 ++---
 drivers/gpu/drm/i915/intel_guc_loader.c | 42 +
 3 files changed, 44 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
index 2324677..45f4fd3 100644
--- a/drivers/gpu/drm/i915/intel_guc.h
+++ b/drivers/gpu/drm/i915/intel_guc.h
@@ -59,6 +59,9 @@ enum intel_uc_fw_status {
UC_FIRMWARE_SUCCESS
 };
 
+#define UC_FW_TYPE_GUC 0
+#define UC_FW_TYPE_HUC 1
+
 /*
  * This structure encapsulates all the data needed during the process
  * of fetching, caching, and loading the firmware image into the GuC.
@@ -76,6 +79,7 @@ struct intel_uc_fw {
uint16_t major_ver_found;
uint16_t minor_ver_found;
 
+   uint32_t fw_type;
uint32_t header_size;
uint32_t header_offset;
uint32_t rsa_size;
diff --git a/drivers/gpu/drm/i915/intel_guc_fwif.h 
b/drivers/gpu/drm/i915/intel_guc_fwif.h
index b4632f0..f8846d6 100644
--- a/drivers/gpu/drm/i915/intel_guc_fwif.h
+++ b/drivers/gpu/drm/i915/intel_guc_fwif.h
@@ -146,7 +146,7 @@
  * The GuC firmware layout looks like this:
  *
  * +---+
- * |guc_css_header |
+ * | uc_css_header |
  * | contains major/minor version  |
  * +---+
  * | uCode |
@@ -172,9 +172,16 @@
  * 3. Length info of each component can be found in header, in dwords.
  * 4. Modulus and exponent key are not required by driver. They may not appear
  * in fw. So driver will load a truncated firmware in this case.
+ *
+ * HuC firmware layout is same as GuC firmware.
+ *
+ * HuC firmware css header is different. However, the only difference is where
+ * the version information is saved. The uc_css_header is unified to support
+ * both. Driver should get HuC version from uc_css_header.huc_sw_version, while
+ * uc_css_header.guc_sw_version for GuC.
  */
 
-struct guc_css_header {
+struct uc_css_header {
uint32_t module_type;
/* header_size includes all non-uCode bits, including css_header, rsa
 * key, modulus key and exponent data. */
@@ -205,7 +212,10 @@ struct guc_css_header {
 
char username[8];
char buildnumber[12];
-   uint32_t device_id;
+   union {
+   uint32_t device_id;
+   uint32_t huc_sw_version;
+   };
uint32_t guc_sw_version;
uint32_t prod_preprod_fw;
uint32_t reserved[12];
diff --git a/drivers/gpu/drm/i915/intel_guc_loader.c 
b/drivers/gpu/drm/i915/intel_guc_loader.c
index e11e1e8..a704d80 100644
--- a/drivers/gpu/drm/i915/intel_guc_loader.c
+++ b/drivers/gpu/drm/i915/intel_guc_loader.c
@@ -464,7 +464,7 @@ void intel_uc_fw_fetch(struct drm_device *dev, struct 
intel_uc_fw *uc_fw)
 {
struct drm_i915_gem_object *obj;
const struct firmware *fw;
-   struct guc_css_header *css;
+   struct uc_css_header *css;
size_t size;
int err;
 
@@ -481,19 +481,19 @@ void intel_uc_fw_fetch(struct drm_device *dev, struct 
intel_uc_fw *uc_fw)
uc_fw->uc_fw_path, fw);
 
/* Check the size of the blob before examining buffer contents */
-   if (fw->size < sizeof(struct guc_css_header)) {
+   if (fw->size < sizeof(struct uc_css_header)) {
DRM_ERROR("Firmware header is missing\n");
goto fail;
}
 
-   css = (struct guc_css_header *)fw->data;
+   css = (struct uc_css_header *)fw->data;
 
/* Firmware bits always start from header */
uc_fw->header_offset = 0;
uc_fw->header_size = (css->header_size_dw - css->modulus_size_dw -
css->key_size_dw - css->exponent_size_dw) * sizeof(u32);
 
-   if (uc_fw->header_size != sizeof(struct guc_css_header)) {
+   if (uc_fw->header_size != sizeof(struct uc_css_header)) {
DRM_ERROR("CSS header definition mismatch\n");
goto fail;
}
@@ -517,23 +517,35 @@ void intel_uc_fw_fetch(struct drm_device *dev, struct 
intel_uc_fw *uc_fw)
goto fail;
}
 
-   /* Header and uCode will be loaded to WOPCM. Size of the two. */
-   size = uc_fw->header_size + uc_fw->ucode_size;
-
-   /* Top 32k of WOPCM is reserved (8K stack + 24k RC6 context). */
-   if (size > GUC_WOPCM_SIZE_VALUE - 0x8000) {
-   DRM_ERROR("Firmware is too large to fit in WOPCM\n");
-   goto fail;
-   }
-
/*
 

[Intel-gfx] [PATCH 5/6] drm/i915/huc: Add debugfs for HuC loading status check

2016-01-11 Thread yu . dai
From: Alex Dai 

Add debugfs entry for HuC loading status check.

Signed-off-by: Alex Dai 
Signed-off-by: Peter Antoine 
---
 drivers/gpu/drm/i915/i915_debugfs.c | 32 
 1 file changed, 32 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c 
b/drivers/gpu/drm/i915/i915_debugfs.c
index ec667f3..7676f56 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2406,6 +2406,37 @@ static int i915_llc(struct seq_file *m, void *data)
return 0;
 }
 
+static int i915_huc_load_status_info(struct seq_file *m, void *data)
+{
+   struct drm_info_node *node = m->private;
+   struct drm_i915_private *dev_priv = node->minor->dev->dev_private;
+   struct intel_uc_fw *huc_fw = &dev_priv->huc.huc_fw;
+
+   if (!HAS_HUC_UCODE(dev_priv->dev))
+   return 0;
+
+   seq_printf(m, "HuC firmware status:\n");
+   seq_printf(m, "\tpath: %s\n", huc_fw->uc_fw_path);
+   seq_printf(m, "\tfetch: %s\n",
+   intel_uc_fw_status_repr(huc_fw->fetch_status));
+   seq_printf(m, "\tload: %s\n",
+   intel_uc_fw_status_repr(huc_fw->load_status));
+   seq_printf(m, "\tversion wanted: %d.%d\n",
+   huc_fw->major_ver_wanted, huc_fw->minor_ver_wanted);
+   seq_printf(m, "\tversion found: %d.%d\n",
+   huc_fw->major_ver_found, huc_fw->minor_ver_found);
+   seq_printf(m, "\theader: offset is %d; size = %d\n",
+   huc_fw->header_offset, huc_fw->header_size);
+   seq_printf(m, "\tuCode: offset is %d; size = %d\n",
+   huc_fw->ucode_offset, huc_fw->ucode_size);
+   seq_printf(m, "\tRSA: offset is %d; size = %d\n",
+   huc_fw->rsa_offset, huc_fw->rsa_size);
+
+   seq_printf(m, "\nHuC status 0x%08x:\n", I915_READ(HUC_STATUS2));
+
+   return 0;
+}
+
 static int i915_guc_load_status_info(struct seq_file *m, void *data)
 {
struct drm_info_node *node = m->private;
@@ -5346,6 +5377,7 @@ static const struct drm_info_list i915_debugfs_list[] = {
{"i915_guc_info", i915_guc_info, 0},
{"i915_guc_load_status", i915_guc_load_status_info, 0},
{"i915_guc_log_dump", i915_guc_log_dump, 0},
+   {"i915_huc_load_status", i915_huc_load_status_info, 0},
{"i915_frequency_info", i915_frequency_info, 0},
{"i915_hangcheck_info", i915_hangcheck_info, 0},
{"i915_drpc_info", i915_drpc_info, 0},
-- 
2.5.0



[Intel-gfx] [PATCH 6/6] drm/i915/huc: Support HuC authentication

2016-01-11 Thread yu . dai
From: Alex Dai 

The HuC authentication is done via a host2guc call. The HuC RSA keys
are sent to the GuC for authentication.

Signed-off-by: Alex Dai 
Signed-off-by: Peter Antoine 
---
 drivers/gpu/drm/i915/i915_guc_submission.c | 65 ++
 drivers/gpu/drm/i915/intel_guc_fwif.h  |  1 +
 drivers/gpu/drm/i915/intel_guc_loader.c|  2 +
 3 files changed, 68 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c 
b/drivers/gpu/drm/i915/i915_guc_submission.c
index 8ce4f32..096b524 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -25,6 +25,7 @@
 #include 
 #include "i915_drv.h"
 #include "intel_guc.h"
+#include "intel_huc.h"
 
 /**
  * DOC: GuC-based command submission
@@ -1059,3 +1060,67 @@ int intel_guc_resume(struct drm_device *dev)
 
return host2guc_action(guc, data, ARRAY_SIZE(data));
 }
+
+/**
+ * intel_huc_ucode_auth() - authenticate ucode
+ * @dev: the drm device
+ *
+ * Triggers a HuC fw authentication request to the GuC via host-2-guc
+ * interface.
+ */
+void intel_huc_ucode_auth(struct drm_device *dev)
+{
+   struct drm_i915_private *dev_priv = dev->dev_private;
+   struct intel_guc *guc = &dev_priv->guc;
+   struct intel_huc *huc = &dev_priv->huc;
+   int ret;
+   u32 data[2];
+
+   /* Bypass the case where there is no HuC firmware */
+   if (huc->huc_fw.fetch_status == UC_FIRMWARE_NONE ||
+   huc->huc_fw.load_status == UC_FIRMWARE_NONE)
+   return;
+
+   if (guc->guc_fw.load_status != UC_FIRMWARE_SUCCESS) {
+   DRM_ERROR("HuC: GuC fw wasn't loaded. Can't authenticate");
+   return;
+   }
+
+   if (huc->huc_fw.load_status != UC_FIRMWARE_SUCCESS) {
+   DRM_ERROR("HuC: fw wasn't loaded. Nothing to authenticate");
+   return;
+   }
+
+   ret = i915_gem_obj_ggtt_pin(huc->huc_fw.uc_fw_obj, 0, 0);
+   if (ret) {
+   DRM_ERROR("HuC: Pin failed");
+   return;
+   }
+
+   /* Invalidate GuC TLB to let GuC take the latest updates to GTT. */
+   I915_WRITE(GEN8_GTCR, GEN8_GTCR_INVALIDATE);
+
+   /* Specify auth action and where public signature is. It's stored
+* at the beginning of the gem object, before the fw bits
+*/
+   data[0] = HOST2GUC_ACTION_AUTHENTICATE_HUC;
+   data[1] = i915_gem_obj_ggtt_offset(huc->huc_fw.uc_fw_obj) +
+   huc->huc_fw.rsa_offset;
+
+   ret = host2guc_action(guc, data, ARRAY_SIZE(data));
+   if (ret) {
+   DRM_ERROR("HuC: GuC did not ack Auth request\n");
+   goto out;
+   }
+
+   /* Check authentication status, it should be done by now */
+   ret = wait_for_atomic(
+   (I915_READ(HUC_STATUS2) & HUC_FW_VERIFIED) > 0, 5000);
+   if (ret) {
+   DRM_ERROR("HuC: Authentication failed\n");
+   goto out;
+   }
+
+out:
+   i915_gem_object_ggtt_unpin(huc->huc_fw.uc_fw_obj);
+}
diff --git a/drivers/gpu/drm/i915/intel_guc_fwif.h 
b/drivers/gpu/drm/i915/intel_guc_fwif.h
index f8846d6..2974e33 100644
--- a/drivers/gpu/drm/i915/intel_guc_fwif.h
+++ b/drivers/gpu/drm/i915/intel_guc_fwif.h
@@ -429,6 +429,7 @@ enum host2guc_action {
HOST2GUC_ACTION_ENTER_S_STATE = 0x501,
HOST2GUC_ACTION_EXIT_S_STATE = 0x502,
HOST2GUC_ACTION_SLPC_REQUEST = 0x3003,
+   HOST2GUC_ACTION_AUTHENTICATE_HUC = 0x4000,
HOST2GUC_ACTION_LIMIT
 };
 
diff --git a/drivers/gpu/drm/i915/intel_guc_loader.c 
b/drivers/gpu/drm/i915/intel_guc_loader.c
index 5832792..45b9c43 100644
--- a/drivers/gpu/drm/i915/intel_guc_loader.c
+++ b/drivers/gpu/drm/i915/intel_guc_loader.c
@@ -424,6 +424,8 @@ int intel_guc_ucode_load(struct drm_device *dev)
intel_uc_fw_status_repr(guc_fw->fetch_status),
intel_uc_fw_status_repr(guc_fw->load_status));
 
+   intel_huc_ucode_auth(dev);
+
if (i915.enable_guc_submission) {
/* The execbuf_client will be recreated. Release it first. */
i915_guc_submission_disable(dev);
-- 
2.5.0



[Intel-gfx] [PATCH 0/6] Support HuC loading and authentication

2016-01-11 Thread yu . dai
From: Alex Dai 

This series of patches is to enable HuC firmware loading and authentication.
The GuC loader and css_header are unified for HuC loading.

Alex Dai (6):
  drm/i915/guc: Make the GuC fw loading helper functions general
  drm/i915/guc: Bypass fw loading gracefully if GuC is not supported
  drm/i915/huc: Unified css_header struct for GuC and HuC
  drm/i915/huc: Add HuC fw loading support
  drm/i915/huc: Add debugfs for HuC loading status check
  drm/i915/huc: Support HuC authentication

 drivers/gpu/drm/i915/Makefile  |   1 +
 drivers/gpu/drm/i915/i915_debugfs.c|  44 -
 drivers/gpu/drm/i915/i915_dma.c|   3 +
 drivers/gpu/drm/i915/i915_drv.h|   3 +
 drivers/gpu/drm/i915/i915_gem.c|   7 +
 drivers/gpu/drm/i915/i915_guc_reg.h|   3 +
 drivers/gpu/drm/i915/i915_guc_submission.c |  65 +++
 drivers/gpu/drm/i915/intel_guc.h   |  45 ++---
 drivers/gpu/drm/i915/intel_guc_fwif.h  |  17 +-
 drivers/gpu/drm/i915/intel_guc_loader.c| 246 ++-
 drivers/gpu/drm/i915/intel_huc.h   |  44 +
 drivers/gpu/drm/i915/intel_huc_loader.c| 262 +
 12 files changed, 594 insertions(+), 146 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/intel_huc.h
 create mode 100644 drivers/gpu/drm/i915/intel_huc_loader.c

-- 
2.5.0



Re: [Intel-gfx] [PATCH 1/3] drm/i915: Adding Broxton GuC Loader Support

2016-01-11 Thread Yu Dai

Reviewed-by: Alex Dai 

On 01/08/2016 07:03 AM, Peter Antoine wrote:

This commit adds the Broxton target to the GuC loader.

Issue: https://jira01.devtools.intel.com/browse/VIZ-6638
Signed-off-by: Peter Antoine 
---
  drivers/gpu/drm/i915/intel_guc_loader.c | 7 +++
  1 file changed, 7 insertions(+)

diff --git a/drivers/gpu/drm/i915/intel_guc_loader.c 
b/drivers/gpu/drm/i915/intel_guc_loader.c
index 550921f..8182d11 100644
--- a/drivers/gpu/drm/i915/intel_guc_loader.c
+++ b/drivers/gpu/drm/i915/intel_guc_loader.c
@@ -62,6 +62,9 @@
  #define I915_SKL_GUC_UCODE "i915/skl_guc_ver4.bin"
  MODULE_FIRMWARE(I915_SKL_GUC_UCODE);
  
+#define I915_BXT_GUC_UCODE "i915/bxt_guc_ver3.bin"
+MODULE_FIRMWARE(I915_BXT_GUC_UCODE);
+
  /* User-friendly representation of an enum */
  const char *intel_guc_fw_status_repr(enum intel_guc_fw_status status)
  {
@@ -587,6 +590,10 @@ void intel_guc_ucode_init(struct drm_device *dev)
fw_path = I915_SKL_GUC_UCODE;
guc_fw->guc_fw_major_wanted = 4;
guc_fw->guc_fw_minor_wanted = 3;
+   } else if (IS_BROXTON(dev)) {
+   fw_path = I915_BXT_GUC_UCODE;
+   guc_fw->guc_fw_major_wanted = 3;
+   guc_fw->guc_fw_minor_wanted = 0;
} else {
i915.enable_guc_submission = false;
fw_path = ""; /* unknown device */




Re: [Intel-gfx] [PATCH 2/3] drm/i915: resize the GuC WOPCM for rc6

2016-01-11 Thread Yu Dai

Reviewed-by: Alex Dai 

On 01/08/2016 07:03 AM, Peter Antoine wrote:

This patch resizes the GuC WOPCM so that the GuC and the RC6 memory
spaces do not overlap.

Issue: https://jira01.devtools.intel.com/browse/VIZ-6638
Signed-off-by: Peter Antoine 
---
  drivers/gpu/drm/i915/i915_guc_reg.h | 3 ++-
  drivers/gpu/drm/i915/intel_guc_loader.c | 5 +
  2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_reg.h 
b/drivers/gpu/drm/i915/i915_guc_reg.h
index 685c799..cb938b0 100644
--- a/drivers/gpu/drm/i915/i915_guc_reg.h
+++ b/drivers/gpu/drm/i915/i915_guc_reg.h
@@ -58,7 +58,8 @@
  #define GUC_MAX_IDLE_COUNT		_MMIO(0xC3E4)
  
  #define GUC_WOPCM_SIZE			_MMIO(0xc050)

-#define   GUC_WOPCM_SIZE_VALUE   (0x80 << 12)/* 512KB */
+#define   GUC_WOPCM_SIZE_VALUE (0x80 << 12)  /* 512KB */
+#define   BXT_GUC_WOPCM_SIZE_VALUE (0x70 << 12)  /* 448KB */
  
  /* GuC addresses below GUC_WOPCM_TOP don't map through the GTT */

  #define   GUC_WOPCM_TOP   (GUC_WOPCM_SIZE_VALUE)
diff --git a/drivers/gpu/drm/i915/intel_guc_loader.c 
b/drivers/gpu/drm/i915/intel_guc_loader.c
index 8182d11..6b17d44 100644
--- a/drivers/gpu/drm/i915/intel_guc_loader.c
+++ b/drivers/gpu/drm/i915/intel_guc_loader.c
@@ -305,6 +305,11 @@ static int guc_ucode_xfer(struct drm_i915_private 
*dev_priv)
  
 	/* init WOPCM */
-	I915_WRITE(GUC_WOPCM_SIZE, GUC_WOPCM_SIZE_VALUE);
+	if (IS_BROXTON(dev))
+		I915_WRITE(GUC_WOPCM_SIZE, BXT_GUC_WOPCM_SIZE_VALUE);
+	else
+		I915_WRITE(GUC_WOPCM_SIZE, GUC_WOPCM_SIZE_VALUE);
+
I915_WRITE(DMA_GUC_WOPCM_OFFSET, GUC_WOPCM_OFFSET_VALUE);
  
  	/* Enable MIA caching. GuC clock gating is disabled. */




Re: [Intel-gfx] [PATCH 3/3] drm/i915: Wait after context init with GuC Submission

2016-01-11 Thread Yu Dai

Reviewed-by: Alex Dai 

On 01/08/2016 07:03 AM, Peter Antoine wrote:

Per-context initialisation GPU instructions (which are injected directly
into the ringbuffer rather than being submitted as a batch) should not
be allowed to mix with user-generated batches in the same submission; it
will cause confusion for the GuC (which might merge a subsequent
preemptive request with the non-preemptive initialisation code), and for
the scheduler, which wouldn't know how to re-inject a non-batch request
if it were the victim of preemption.

Therefore, we should wait for the initialisation request to complete
before making the newly-initialised context available for user-mode
submissions.

Here, we add a call to i915_wait_request() after each existing call to
i915_add_request_no_flush() (in i915_gem_init_hw(), for the default
per-engine contexts, and intel_lr_context_deferred_create(), for all
others).

Adapted from Dave Gordon's patch, which is adapted from Alex's earlier
patch, which added the wait only to intel_lr_context_render_state_init().

Issue: https://jira01.devtools.intel.com/browse/VIZ-6638
Signed-off-by: Dave Gordon 
Signed-off-by: Peter Antoine 
---
  drivers/gpu/drm/i915/i915_gem.c  | 10 ++
  drivers/gpu/drm/i915/intel_lrc.c | 11 +++
  2 files changed, 21 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 5be4433..e71bf90 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4847,6 +4847,16 @@ i915_gem_init_hw(struct drm_device *dev)
}
  
  		i915_add_request_no_flush(req);

+
+   /*
+* GuC firmware will try to collapse its DPC work queue if the
+* new one is for same context. So the following breadcrumb
+* could be amended to this batch and submitted as one batch.
+* Wait here to make sure the context state init is finished
+* before any other submission to GuC.
+*/
+   if (i915.enable_guc_submission)
+   ret = i915_wait_request(req);
}
  
  out:

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 3aa6147..f18fb11 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -2499,6 +2499,17 @@ int intel_lr_context_deferred_alloc(struct intel_context 
*ctx,
goto error_ringbuf;
}
i915_add_request_no_flush(req);
+
+   /*
+* GuC firmware will try to collapse its DPC work queue
+* if the new one is for same context. So the
+* following breadcrumb could be amended to this batch
+* and submitted as one batch. Wait here to make sure
+* the context state init is finished before any other
+* submission to GuC.
+*/
+   if (i915.enable_guc_submission)
+   ret = i915_wait_request(req);
}
return 0;
  




Re: [Intel-gfx] [PATCH i-g-t] tests/gem_guc_loading: Adding simple GuC loading test

2016-01-07 Thread Yu Dai

This has been reviewed internally. LGTM.

Reviewed-by: Alex Dai 

On 01/05/2016 08:17 AM, Lukasz Fiedorowicz wrote:

Test checks the GuC debugfs file for successful loading confirmation

Signed-off-by: Lukasz Fiedorowicz 
---
  tests/Makefile.sources  |  1 +
  tests/gem_guc_loading.c | 89 +
  2 files changed, 90 insertions(+)
  create mode 100644 tests/gem_guc_loading.c

diff --git a/tests/Makefile.sources b/tests/Makefile.sources
index d594038..331234f 100644
--- a/tests/Makefile.sources
+++ b/tests/Makefile.sources
@@ -36,6 +36,7 @@ TESTS_progs_M = \
gem_flink_basic \
gem_flink_race \
gem_linear_blits \
+   gem_guc_loading \
gem_madvise \
gem_mmap \
gem_mmap_gtt \
diff --git a/tests/gem_guc_loading.c b/tests/gem_guc_loading.c
new file mode 100644
index 000..fd53a46
--- /dev/null
+++ b/tests/gem_guc_loading.c
@@ -0,0 +1,89 @@
+/*
+ * Copyright © 2015 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ * Authors:
+ *Lukasz Fiedorowicz 
+ *
+ */
+
+#include 
+#include 
+#include 
+#include 
+
+#include "igt.h"
+
+IGT_TEST_DESCRIPTION("GuC firmware loading test.");
+
+#define LOAD_STATUS_BUF_SIZE 96
+
+enum guc_status { GUC_ENABLED, GUC_DISABLED };
+
+int guc_status_fd;
+
+static void open_guc_status(void)
+{
+   guc_status_fd = igt_debugfs_open("i915_guc_load_status", O_RDONLY);
+   igt_assert_f(guc_status_fd >= 0, "Can't open i915_guc_load_status\n");
+}
+
+static enum guc_status get_guc_status(void)
+{
+   char buf[LOAD_STATUS_BUF_SIZE];
+
+   FILE *fp = fdopen(guc_status_fd, "r");
+   igt_assert_f(fp != NULL, "Can't open i915_guc_load_status file\n");
+
+   while (fgets(buf, LOAD_STATUS_BUF_SIZE, fp))
+   if ((strstr(buf, "\tload: SUCCESS\n")))
+   return GUC_ENABLED;
+
+   return GUC_DISABLED;
+}
+
+static void close_guc_status(void)
+{
+   close(guc_status_fd);
+}
+
+static void test_guc_loaded(void)
+{
+   igt_assert_f(get_guc_status() == GUC_ENABLED, "GuC is disabled\n");
+}
+
+igt_main
+{
+   int gfx_fd = 0;
+   int gen = 0;
+
+   igt_fixture
+   {
+   gfx_fd = drm_open_driver(DRIVER_INTEL);
+   gen = intel_gen(intel_get_drm_devid(gfx_fd));
+   igt_require(gen >= 9);
+   open_guc_status();
+   }
+
+   igt_subtest("guc_loaded") test_guc_loaded();
+
+   igt_fixture close_guc_status();
+}


___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH] drm/i915/guc: Fix a memory leak where guc->execbuf_client is not freed

2016-01-06 Thread yu . dai
From: Alex Dai 

During driver unloading, the guc_client created for command submission
needs to be released to avoid memory leak.

Signed-off-by: Alex Dai 
---
 drivers/gpu/drm/i915/i915_guc_submission.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c 
b/drivers/gpu/drm/i915/i915_guc_submission.c
index 9c24424..8ce4f32 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -995,6 +995,9 @@ void i915_guc_submission_fini(struct drm_device *dev)
struct drm_i915_private *dev_priv = dev->dev_private;
	struct intel_guc *guc = &dev_priv->guc;
 
+   if (i915.enable_guc_submission)
+   i915_guc_submission_disable(dev);
+
gem_release_guc_obj(dev_priv->guc.ads_obj);
guc->ads_obj = NULL;
 
-- 
2.5.0



[Intel-gfx] [PATCH v3] drm/i915/guc: Expose (intel)_lr_context_size()

2016-01-05 Thread yu . dai
From: Dave Gordon 

The GuC code needs to know the size of a logical context, so we
expose get_lr_context_size(), renaming it intel_lr_context_size()
to fit the naming conventions for nonstatic functions.

Add comments or kerneldoc (Daniel Vetter)

For: VIZ-2021
Signed-off-by: Dave Gordon 
Signed-off-by: Alex Dai 
---
 drivers/gpu/drm/i915/intel_lrc.c | 13 +++--
 drivers/gpu/drm/i915/intel_lrc.h |  1 +
 2 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index e095058..9be9835 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -2399,7 +2399,16 @@ void intel_lr_context_free(struct intel_context *ctx)
}
 }
 
-static uint32_t get_lr_context_size(struct intel_engine_cs *ring)
+/**
+ * intel_lr_context_size() - get the LRC state pages size
+ * @ring: engine to be used to get ring id
+ *
+ * The LRC state pages size varies for different engines. This function is used
+ * in ExecList / GuC mode to get LRC state pages size.
+ *
+ * Return: size of the LRC state pages; zero on unknown engine.
+ */
+uint32_t intel_lr_context_size(struct intel_engine_cs *ring)
 {
int ret = 0;
 
@@ -2467,7 +2476,7 @@ int intel_lr_context_deferred_alloc(struct intel_context 
*ctx,
WARN_ON(ctx->legacy_hw_ctx.rcs_state != NULL);
WARN_ON(ctx->engine[ring->id].state);
 
-   context_size = round_up(get_lr_context_size(ring), 4096);
+   context_size = round_up(intel_lr_context_size(ring), 4096);
 
/* One extra page as the sharing data between driver and GuC */
context_size += PAGE_SIZE * LRC_PPHWSP_PN;
diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/intel_lrc.h
index 0b821b9..ae90f86 100644
--- a/drivers/gpu/drm/i915/intel_lrc.h
+++ b/drivers/gpu/drm/i915/intel_lrc.h
@@ -84,6 +84,7 @@ static inline void intel_logical_ring_emit_reg(struct 
intel_ringbuffer *ringbuf,
 #define LRC_STATE_PN   (LRC_PPHWSP_PN + 1)
 
 void intel_lr_context_free(struct intel_context *ctx);
+uint32_t intel_lr_context_size(struct intel_engine_cs *ring);
 int intel_lr_context_deferred_alloc(struct intel_context *ctx,
struct intel_engine_cs *ring);
 void intel_lr_context_unpin(struct drm_i915_gem_request *req);
-- 
2.5.0



[Intel-gfx] [PATCH] drm/i915/guc: Enable GuC submission, where supported

2016-01-05 Thread yu . dai
From: Alex Dai 

This is to enable command submission via GuC.

Signed-off-by: Dave Gordon 
Signed-off-by: Alex Dai 
---
 drivers/gpu/drm/i915/i915_params.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_params.c 
b/drivers/gpu/drm/i915/i915_params.c
index 8d90c25..2ca4690 100644
--- a/drivers/gpu/drm/i915/i915_params.c
+++ b/drivers/gpu/drm/i915/i915_params.c
@@ -55,7 +55,7 @@ struct i915_params i915 __read_mostly = {
.verbose_state_checks = 1,
.nuclear_pageflip = 0,
.edp_vswing = 0,
-   .enable_guc_submission = false,
+   .enable_guc_submission = true,
.guc_log_level = -1,
 };
 
@@ -198,7 +198,7 @@ MODULE_PARM_DESC(edp_vswing,
 "2=default swing(400mV))");
 
 module_param_named_unsafe(enable_guc_submission, i915.enable_guc_submission, 
bool, 0400);
-MODULE_PARM_DESC(enable_guc_submission, "Enable GuC submission 
(default:false)");
+MODULE_PARM_DESC(enable_guc_submission, "Enable GuC submission 
(default:true)");
 
 module_param_named(guc_log_level, i915.guc_log_level, int, 0400);
 MODULE_PARM_DESC(guc_log_level,
-- 
2.5.0



Re: [Intel-gfx] [PATCH v2 1/5] drm/i915/guc: Expose (intel)_lr_context_size()

2016-01-05 Thread Yu Dai



On 01/05/2016 02:27 AM, Daniel Vetter wrote:

On Fri, Dec 18, 2015 at 12:00:08PM -0800, yu@intel.com wrote:
> From: Dave Gordon 
>
> The GuC code needs to know the size of a logical context, so we
> expose get_lr_context_size(), renaming it intel_lr_context_size()
> to fit the naming conventions for nonstatic functions.
>
> For: VIZ-2021
> Signed-off-by: Dave Gordon 
> Signed-off-by: Alex Dai 
>
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c 
b/drivers/gpu/drm/i915/intel_lrc.c
> index e5fb8ea..7a6b896 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -2399,7 +2399,7 @@ void intel_lr_context_free(struct intel_context *ctx)
>}
>  }
>
> -static uint32_t get_lr_context_size(struct intel_engine_cs *ring)
> +uint32_t intel_lr_context_size(struct intel_engine_cs *ring)

As a rule of thumb, non-static functions should have kerneldoc within
drm/i915. At least in the files where we bothered with kerneldoc already.
Please do a follow-up patch to remedy this.


Thanks for the review. I submitted v3 adding some kerneldoc comments. 
The other patches in this series are untouched.


Thanks,
Alex

>  {
>int ret = 0;
>
> @@ -2467,7 +2467,7 @@ int intel_lr_context_deferred_alloc(struct 
intel_context *ctx,
>WARN_ON(ctx->legacy_hw_ctx.rcs_state != NULL);
>WARN_ON(ctx->engine[ring->id].state);
>
> -  context_size = round_up(get_lr_context_size(ring), 4096);
> +  context_size = round_up(intel_lr_context_size(ring), 4096);
>
>/* One extra page as the sharing data between driver and GuC */
>context_size += PAGE_SIZE * LRC_PPHWSP_PN;
> diff --git a/drivers/gpu/drm/i915/intel_lrc.h 
b/drivers/gpu/drm/i915/intel_lrc.h
> index 0b821b9..ae90f86 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.h
> +++ b/drivers/gpu/drm/i915/intel_lrc.h
> @@ -84,6 +84,7 @@ static inline void intel_logical_ring_emit_reg(struct 
intel_ringbuffer *ringbuf,
>  #define LRC_STATE_PN  (LRC_PPHWSP_PN + 1)
>
>  void intel_lr_context_free(struct intel_context *ctx);
> +uint32_t intel_lr_context_size(struct intel_engine_cs *ring);
>  int intel_lr_context_deferred_alloc(struct intel_context *ctx,
>struct intel_engine_cs *ring);
>  void intel_lr_context_unpin(struct drm_i915_gem_request *req);
> --
> 2.5.0
>





Re: [Intel-gfx] [PATCH] drm/i915/guc: Fix a warning message problem during driver unload

2015-12-18 Thread Yu Dai



On 12/18/2015 01:55 AM, Jani Nikula wrote:

On Thu, 17 Dec 2015, yu@intel.com wrote:
> From: Alex Dai 
>
> The device struct_mutex needs to be held before releasing any GEM
> objects allocated by GuC.

This is indeed so, but your patch subject needs to say it fixes an
actual bug rather than a "warning message problem" which makes one think
it's benign.

Also, if you see a warning splat in dmesg, please include that in the
commit message when you fix it. It's *much* easier to match bug reports
with fixes when you have that.

I will change the subject to something like "Fix a potential issue ..." 
and include the warning log in the commit message.


Thanks,
Alex

>
> Signed-off-by: Alex Dai 
> ---
>  drivers/gpu/drm/i915/intel_guc_loader.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_guc_loader.c 
b/drivers/gpu/drm/i915/intel_guc_loader.c
> index 625272f4..4748651 100644
> --- a/drivers/gpu/drm/i915/intel_guc_loader.c
> +++ b/drivers/gpu/drm/i915/intel_guc_loader.c
> @@ -631,10 +631,10 @@ void intel_guc_ucode_fini(struct drm_device *dev)
>struct drm_i915_private *dev_priv = dev->dev_private;
>struct intel_guc_fw *guc_fw = &dev_priv->guc.guc_fw;
>
> +  mutex_lock(&dev->struct_mutex);
>direct_interrupts_to_host(dev_priv);
>i915_guc_submission_fini(dev);
>
> -  mutex_lock(&dev->struct_mutex);
>if (guc_fw->guc_fw_obj)
>drm_gem_object_unreference(&guc_fw->guc_fw_obj->base);
>guc_fw->guc_fw_obj = NULL;





[Intel-gfx] [PATCH v1] drm/i915/guc: Fix a potential issue during driver unload

2015-12-18 Thread yu . dai
From: Alex Dai 

The device struct_mutex needs to be held before releasing any GEM
objects allocated by GuC.

WARNING: CPU: 0 PID: 1575 at include/drm/drm_gem.h:217 
gem_release_guc_obj+0x5f/0x70 [i915]()
Call Trace:
 [] dump_stack+0x44/0x60
 [] warn_slowpath_common+0x82/0xc0
 [] warn_slowpath_null+0x1a/0x20
 [] gem_release_guc_obj+0x5f/0x70 [i915]
 [] i915_guc_submission_fini+0x1a/0x70 [i915]
 [] intel_guc_ucode_fini+0x29/0xa0 [i915]
 [] i915_driver_unload+0x14d/0x290 [i915]
 [] drm_dev_unregister+0x29/0xb0 [drm]
 [] drm_put_dev+0x23/0x60 [drm]
 [] i915_pci_remove+0x15/0x20 [i915]
 [] pci_device_remove+0x39/0xc0
 [] __device_release_driver+0xa1/0x150
 [] driver_detach+0xb5/0xc0
 [] bus_remove_driver+0x55/0xd0
 [] driver_unregister+0x2c/0x50
 [] pci_unregister_driver+0x29/0x90
 [] drm_pci_exit+0x94/0xb0 [drm]
 [] i915_exit+0x20/0xf5c [i915]
 [] SyS_delete_module+0x1b5/0x210
 [] entry_SYSCALL_64_fastpath+0x16/0x75
---[ end trace f711c4eb63588bb7 ]---

v1: Add backtrace log.

Signed-off-by: Alex Dai 

diff --git a/drivers/gpu/drm/i915/intel_guc_loader.c 
b/drivers/gpu/drm/i915/intel_guc_loader.c
index 625272f4..4748651 100644
--- a/drivers/gpu/drm/i915/intel_guc_loader.c
+++ b/drivers/gpu/drm/i915/intel_guc_loader.c
@@ -631,10 +631,10 @@ void intel_guc_ucode_fini(struct drm_device *dev)
struct drm_i915_private *dev_priv = dev->dev_private;
	struct intel_guc_fw *guc_fw = &dev_priv->guc.guc_fw;
 
+   mutex_lock(&dev->struct_mutex);
direct_interrupts_to_host(dev_priv);
i915_guc_submission_fini(dev);
 
-   mutex_lock(&dev->struct_mutex);
if (guc_fw->guc_fw_obj)
	drm_gem_object_unreference(&guc_fw->guc_fw_obj->base);
guc_fw->guc_fw_obj = NULL;
-- 
2.5.0



[Intel-gfx] [PATCH v2 4/5] drm/i915/guc: Add GuC ADS - MMIO reg state

2015-12-18 Thread yu . dai
From: Alex Dai 

GuC needs to know which registers will be saved and restored, and how,
during events such as engine reset or power state changes. For now
only the base address of the reg state is initialized. The detailed
register table will probably be set up in a future GuC TDR or
preemption patch series.

Signed-off-by: Alex Dai 

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c 
b/drivers/gpu/drm/i915/i915_guc_submission.c
index 31a407b..40cb4ba 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -870,12 +870,15 @@ static void guc_create_ads(struct intel_guc *guc)
struct drm_i915_gem_object *obj;
struct guc_ads *ads;
struct guc_policies *policies;
+   struct guc_mmio_reg_state *reg_state;
struct intel_engine_cs *ring;
struct page *page;
u32 size, i;
 
/* The ads obj includes the struct itself and buffers passed to GuC */
-   size = sizeof(struct guc_ads) + sizeof(struct guc_policies);
+   size = sizeof(struct guc_ads) + sizeof(struct guc_policies) +
+   sizeof(struct guc_mmio_reg_state) +
+   GUC_S3_SAVE_SPACE_PAGES * PAGE_SIZE;
 
obj = guc->ads_obj;
if (!obj) {
@@ -909,6 +912,23 @@ static void guc_create_ads(struct intel_guc *guc)
ads->scheduler_policies = i915_gem_obj_ggtt_offset(obj) +
sizeof(struct guc_ads);
 
+   /* MMIO reg state */
+   reg_state = (void *)policies + sizeof(struct guc_policies);
+
+   for (i = 0; i < I915_NUM_RINGS; i++) {
+   reg_state->mmio_white_list[i].mmio_start =
+   dev_priv->ring[i].mmio_base + GUC_MMIO_WHITE_LIST_START;
+
+   /* Nothing to be saved or restored for now. */
+   reg_state->mmio_white_list[i].count = 0;
+   }
+
+   ads->reg_state_addr = ads->scheduler_policies +
+   sizeof(struct guc_policies);
+
+   ads->reg_state_buffer = ads->reg_state_addr +
+   sizeof(struct guc_mmio_reg_state);
+
kunmap(page);
 }
 
diff --git a/drivers/gpu/drm/i915/intel_guc_fwif.h 
b/drivers/gpu/drm/i915/intel_guc_fwif.h
index 0cc17c7..1bb6410 100644
--- a/drivers/gpu/drm/i915/intel_guc_fwif.h
+++ b/drivers/gpu/drm/i915/intel_guc_fwif.h
@@ -361,6 +361,43 @@ struct guc_policies {
u32 reserved[19];
 } __packed;
 
+/* GuC MMIO reg state struct */
+
+#define GUC_REGSET_FLAGS_NONE  0x0
+#define GUC_REGSET_POWERCYCLE  0x1
+#define GUC_REGSET_MASKED  0x2
+#define GUC_REGSET_ENGINERESET 0x4
+#define GUC_REGSET_SAVE_DEFAULT_VALUE  0x8
+#define GUC_REGSET_SAVE_CURRENT_VALUE  0x10
+
+#define GUC_REGSET_MAX_REGISTERS   20
+#define GUC_MMIO_WHITE_LIST_START  0x24d0
+#define GUC_MMIO_WHITE_LIST_MAX    12
+#define GUC_S3_SAVE_SPACE_PAGES    10
+
+struct guc_mmio_regset {
+   struct __packed {
+   u32 offset;
+   u32 value;
+   u32 flags;
+   } registers[GUC_REGSET_MAX_REGISTERS];
+
+   u32 values_valid;
+   u32 number_of_registers;
+} __packed;
+
+struct guc_mmio_reg_state {
+   struct guc_mmio_regset global_reg;
+   struct guc_mmio_regset engine_reg[I915_NUM_RINGS];
+
+   /* MMIO registers that are set as non privileged */
+   struct __packed {
+   u32 mmio_start;
+   u32 offsets[GUC_MMIO_WHITE_LIST_MAX];
+   u32 count;
+   } mmio_white_list[I915_NUM_RINGS];
+} __packed;
+
 /* GuC Additional Data Struct */
 
 struct guc_ads {
-- 
2.5.0



[Intel-gfx] [PATCH v2 3/5] drm/i915/guc: Add GuC ADS - scheduler policies

2015-12-18 Thread yu . dai
From: Alex Dai 

GuC supports different scheduling policies for its four internal
queues. Currently these are all set to the same default values as
the KMD_NORMAL queue.

In particular, POLICY_MAX_NUM_WI is set to 15 to match the GuC's
internal maximum submit queue depth and avoid an out-of-space problem.
This value is the maximum number of work items allowed to be queued
for one DPC process. A smaller value lets the GuC schedule more
frequently, while a larger one increases the chance to optimize
commands (such as collapsing commands from the same LRC) at the risk
of leaving the CS idle.

v1: tidy up code

Signed-off-by: Alex Dai 

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c 
b/drivers/gpu/drm/i915/i915_guc_submission.c
index d9b9390..31a407b 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -842,17 +842,40 @@ static void guc_create_log(struct intel_guc *guc)
guc->log_flags = (offset << GUC_LOG_BUF_ADDR_SHIFT) | flags;
 }
 
+static void init_guc_policies(struct guc_policies *policies)
+{
+   struct guc_policy *policy;
+   u32 p, i;
+
+   policies->dpc_promote_time = 50;
+   policies->max_num_work_items = POLICY_MAX_NUM_WI;
+
+   for (p = 0; p < GUC_CTX_PRIORITY_NUM; p++) {
+   for (i = 0; i < I915_NUM_RINGS; i++) {
+   policy = &policies->policy[p][i];
+
+   policy->execution_quantum = 100;
+   policy->preemption_time = 50;
+   policy->fault_time = 25;
+   policy->policy_flags = 0;
+   }
+   }
+
+   policies->is_valid = 1;
+}
+
 static void guc_create_ads(struct intel_guc *guc)
 {
struct drm_i915_private *dev_priv = guc_to_i915(guc);
struct drm_i915_gem_object *obj;
struct guc_ads *ads;
+   struct guc_policies *policies;
struct intel_engine_cs *ring;
struct page *page;
u32 size, i;
 
/* The ads obj includes the struct itself and buffers passed to GuC */
-   size = sizeof(struct guc_ads);
+   size = sizeof(struct guc_ads) + sizeof(struct guc_policies);
 
obj = guc->ads_obj;
if (!obj) {
@@ -879,6 +902,13 @@ static void guc_create_ads(struct intel_guc *guc)
for_each_ring(ring, dev_priv, i)
ads->eng_state_size[i] = intel_lr_context_size(ring);
 
+   /* GuC scheduling policies */
+   policies = (void *)ads + sizeof(struct guc_ads);
+   init_guc_policies(policies);
+
+   ads->scheduler_policies = i915_gem_obj_ggtt_offset(obj) +
+   sizeof(struct guc_ads);
+
kunmap(page);
 }
 
diff --git a/drivers/gpu/drm/i915/intel_guc_fwif.h 
b/drivers/gpu/drm/i915/intel_guc_fwif.h
index 76ecc85..0cc17c7 100644
--- a/drivers/gpu/drm/i915/intel_guc_fwif.h
+++ b/drivers/gpu/drm/i915/intel_guc_fwif.h
@@ -39,6 +39,7 @@
 #define GUC_CTX_PRIORITY_HIGH  1
 #define GUC_CTX_PRIORITY_KMD_NORMAL2
 #define GUC_CTX_PRIORITY_NORMAL3
+#define GUC_CTX_PRIORITY_NUM   4
 
 #define GUC_MAX_GPU_CONTEXTS   1024
#define GUC_INVALID_CTX_ID  GUC_MAX_GPU_CONTEXTS
@@ -316,6 +317,50 @@ struct guc_context_desc {
 #define GUC_POWER_D2   3
 #define GUC_POWER_D3   4
 
+/* Scheduling policy settings */
+
+/* Reset engine upon preempt failure */
+#define POLICY_RESET_ENGINE    (1<<0)
+/* Preempt to idle on quantum expiry */
+#define POLICY_PREEMPT_TO_IDLE (1<<1)
+
+#define POLICY_MAX_NUM_WI  15
+
+struct guc_policy {
+   /* Time for one workload to execute. (in micro seconds) */
+   u32 execution_quantum;
+   u32 reserved1;
+
+   /* Time to wait for a preemption request to completed before issuing a
+* reset. (in micro seconds). */
+   u32 preemption_time;
+
+   /* How much time to allow to run after the first fault is observed.
+* Then preempt afterwards. (in micro seconds) */
+   u32 fault_time;
+
+   u32 policy_flags;
+   u32 reserved[2];
+} __packed;
+
+struct guc_policies {
+   struct guc_policy policy[GUC_CTX_PRIORITY_NUM][I915_NUM_RINGS];
+
+   /* In micro seconds. How much time to allow before DPC processing is
+* called back via interrupt (to prevent DPC queue drain starving).
+* Typically 1000s of micro seconds (example only, not granularity). */
+   u32 dpc_promote_time;
+
+   /* Must be set to take these new values. */
+   u32 is_valid;
+
+   /* Max number of WIs to process per call. A large value may keep CS
+* idle. */
+   u32 max_num_work_items;
+
+   u32 reserved[19];
+} __packed;
+
 /* GuC Additional Data Struct */
 
 struct guc_ads {
-- 
2.5.0



[Intel-gfx] [PATCH v2 5/5] drm/i915/guc: Add GuC ADS - enabling ADS

2015-12-18 Thread yu . dai
From: Alex Dai 

Set ADS enabling flag during GuC init.

Signed-off-by: Alex Dai 

diff --git a/drivers/gpu/drm/i915/intel_guc_loader.c 
b/drivers/gpu/drm/i915/intel_guc_loader.c
index 4740949..625272f4 100644
--- a/drivers/gpu/drm/i915/intel_guc_loader.c
+++ b/drivers/gpu/drm/i915/intel_guc_loader.c
@@ -165,6 +165,13 @@ static void set_guc_init_params(struct drm_i915_private 
*dev_priv)
i915.guc_log_level << GUC_LOG_VERBOSITY_SHIFT;
}
 
+   if (guc->ads_obj) {
+   u32 ads = (u32)i915_gem_obj_ggtt_offset(guc->ads_obj)
+   >> PAGE_SHIFT;
+   params[GUC_CTL_DEBUG] |= ads << GUC_ADS_ADDR_SHIFT;
+   params[GUC_CTL_DEBUG] |= GUC_ADS_ENABLED;
+   }
+
/* If GuC submission is enabled, set up additional parameters here */
if (i915.enable_guc_submission) {
u32 pgs = i915_gem_obj_ggtt_offset(dev_priv->guc.ctx_pool_obj);
-- 
2.5.0



[Intel-gfx] [PATCH v2 2/5] drm/i915/guc: Add GuC ADS (Addition Data Structure) - allocation

2015-12-18 Thread yu . dai
From: Alex Dai 

The GuC firmware uses this for various purposes. The ADS itself is
a chunk of memory created by the driver to share with the GuC. Its
members are mostly addresses telling the GuC where to find things
such as the scheduler policies and the register list to be saved
and restored during reset.

This is the first patch of a series to enable the GuC ADS. For now,
we only create the ADS object while keeping it disabled.

v1: remove dead code checking return of kmap_atomic (Chris Wilson)
v2: use kmap instead of the atomic version of it.

Signed-off-by: Alex Dai 

diff --git a/drivers/gpu/drm/i915/i915_guc_reg.h 
b/drivers/gpu/drm/i915/i915_guc_reg.h
index 90a84b4..8d27c09 100644
--- a/drivers/gpu/drm/i915/i915_guc_reg.h
+++ b/drivers/gpu/drm/i915/i915_guc_reg.h
@@ -40,6 +40,7 @@
 #define   GS_MIA_CORE_STATE  (1 << GS_MIA_SHIFT)
 
 #define SOFT_SCRATCH(n)_MMIO(0xc180 + (n) * 4)
+#define SOFT_SCRATCH_COUNT 16
 
 #define UOS_RSA_SCRATCH(i) _MMIO(0xc200 + (i) * 4)
 #define   UOS_RSA_SCRATCH_MAX_COUNT  64
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c 
b/drivers/gpu/drm/i915/i915_guc_submission.c
index 7554d16..d9b9390 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -842,6 +842,46 @@ static void guc_create_log(struct intel_guc *guc)
guc->log_flags = (offset << GUC_LOG_BUF_ADDR_SHIFT) | flags;
 }
 
+static void guc_create_ads(struct intel_guc *guc)
+{
+   struct drm_i915_private *dev_priv = guc_to_i915(guc);
+   struct drm_i915_gem_object *obj;
+   struct guc_ads *ads;
+   struct intel_engine_cs *ring;
+   struct page *page;
+   u32 size, i;
+
+   /* The ads obj includes the struct itself and buffers passed to GuC */
+   size = sizeof(struct guc_ads);
+
+   obj = guc->ads_obj;
+   if (!obj) {
+   obj = gem_allocate_guc_obj(dev_priv->dev, PAGE_ALIGN(size));
+   if (!obj)
+   return;
+
+   guc->ads_obj = obj;
+   }
+
+   page = i915_gem_object_get_page(obj, 0);
+   ads = kmap(page);
+
+   /*
+* The GuC requires a "Golden Context" when it reinitialises
+* engines after a reset. Here we use the Render ring default
+* context, which must already exist and be pinned in the GGTT,
+* so its address won't change after we've told the GuC where
+* to find it.
+*/
+   ring = &dev_priv->ring[RCS];
+   ads->golden_context_lrca = ring->status_page.gfx_addr;
+
+   for_each_ring(ring, dev_priv, i)
+   ads->eng_state_size[i] = intel_lr_context_size(ring);
+
+   kunmap(page);
+}
+
 /*
  * Set up the memory resources to be shared with the GuC.  At this point,
  * we require just one object that can be mapped through the GGTT.
@@ -868,6 +908,8 @@ int i915_guc_submission_init(struct drm_device *dev)
 
guc_create_log(guc);
 
+   guc_create_ads(guc);
+
return 0;
 }
 
@@ -906,6 +948,9 @@ void i915_guc_submission_fini(struct drm_device *dev)
struct drm_i915_private *dev_priv = dev->dev_private;
struct intel_guc *guc = _priv->guc;
 
+   gem_release_guc_obj(dev_priv->guc.ads_obj);
+   guc->ads_obj = NULL;
+
gem_release_guc_obj(dev_priv->guc.log_obj);
guc->log_obj = NULL;
 
diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
index 5cf555d..5c9f894 100644
--- a/drivers/gpu/drm/i915/intel_guc.h
+++ b/drivers/gpu/drm/i915/intel_guc.h
@@ -89,6 +89,8 @@ struct intel_guc {
uint32_t log_flags;
struct drm_i915_gem_object *log_obj;
 
+   struct drm_i915_gem_object *ads_obj;
+
struct drm_i915_gem_object *ctx_pool_obj;
struct ida ctx_ids;
 
diff --git a/drivers/gpu/drm/i915/intel_guc_fwif.h 
b/drivers/gpu/drm/i915/intel_guc_fwif.h
index eaa50a4..76ecc85 100644
--- a/drivers/gpu/drm/i915/intel_guc_fwif.h
+++ b/drivers/gpu/drm/i915/intel_guc_fwif.h
@@ -81,11 +81,14 @@
 #define GUC_CTL_CTXINFO0
 #define   GUC_CTL_CTXNUM_IN16_SHIFT0
 #define   GUC_CTL_BASE_ADDR_SHIFT  12
+
 #define GUC_CTL_ARAT_HIGH  1
 #define GUC_CTL_ARAT_LOW   2
+
 #define GUC_CTL_DEVICE_INFO3
 #define   GUC_CTL_GTTYPE_SHIFT 0
 #define   GUC_CTL_COREFAMILY_SHIFT 7
+
 #define GUC_CTL_LOG_PARAMS 4
 #define   GUC_LOG_VALID(1 << 0)
 #define   GUC_LOG_NOTIFY_ON_HALF_FULL  (1 << 1)
@@ -97,9 +100,12 @@
 #define   GUC_LOG_ISR_PAGES3
 #define   GUC_LOG_ISR_SHIFT9
 #define   GUC_LOG_BUF_ADDR_SHIFT   12
+
 #define GUC_CTL_PAGE_FAULT_CONTROL 5
+
 #define GUC_CTL_WA 6
 #define   GUC_CTL_WA_UK_BY_DRIVER  (1 << 3)
+
 #define GUC_CTL_FEATURE7
 #define   GUC_CTL_VCS2_ENABLED (1 << 0)
 #define   

[Intel-gfx] [PATCH v2 0/5] Add GuC ADS (Addition Data Structure)

2015-12-18 Thread yu . dai
From: Alex Dai 

The GuC firmware uses this for various purposes. The ADS itself is a chunk of
memory created by the driver to share with the GuC. This series creates the
GuC ADS object and sets up some basic settings for it.

This version addresses comments from Chris Wilson: tidy up some code,
replace kmap_atomic with kmap, etc.

Alex Dai (4):
  drm/i915/guc: Add GuC ADS (Addition Data Structure) - allocation
  drm/i915/guc: Add GuC ADS - scheduler policies
  drm/i915/guc: Add GuC ADS - MMIO reg state
  drm/i915/guc: Add GuC ADS - enabling ADS

Dave Gordon (1):
  drm/i915/guc: Expose (intel)_lr_context_size()

 drivers/gpu/drm/i915/i915_guc_reg.h|   1 +
 drivers/gpu/drm/i915/i915_guc_submission.c |  95 
 drivers/gpu/drm/i915/intel_guc.h   |   2 +
 drivers/gpu/drm/i915/intel_guc_fwif.h  | 113 -
 drivers/gpu/drm/i915/intel_guc_loader.c|   7 ++
 drivers/gpu/drm/i915/intel_lrc.c   |   4 +-
 drivers/gpu/drm/i915/intel_lrc.h   |   1 +
 7 files changed, 220 insertions(+), 3 deletions(-)

-- 
2.5.0



[Intel-gfx] [PATCH v2 1/5] drm/i915/guc: Expose (intel)_lr_context_size()

2015-12-18 Thread yu . dai
From: Dave Gordon 

The GuC code needs to know the size of a logical context, so we
expose get_lr_context_size(), renaming it intel_lr_context_size()
to fit the naming conventions for nonstatic functions.

For: VIZ-2021
Signed-off-by: Dave Gordon 
Signed-off-by: Alex Dai 

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index e5fb8ea..7a6b896 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -2399,7 +2399,7 @@ void intel_lr_context_free(struct intel_context *ctx)
}
 }
 
-static uint32_t get_lr_context_size(struct intel_engine_cs *ring)
+uint32_t intel_lr_context_size(struct intel_engine_cs *ring)
 {
int ret = 0;
 
@@ -2467,7 +2467,7 @@ int intel_lr_context_deferred_alloc(struct intel_context 
*ctx,
WARN_ON(ctx->legacy_hw_ctx.rcs_state != NULL);
WARN_ON(ctx->engine[ring->id].state);
 
-   context_size = round_up(get_lr_context_size(ring), 4096);
+   context_size = round_up(intel_lr_context_size(ring), 4096);
 
/* One extra page as the sharing data between driver and GuC */
context_size += PAGE_SIZE * LRC_PPHWSP_PN;
diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/intel_lrc.h
index 0b821b9..ae90f86 100644
--- a/drivers/gpu/drm/i915/intel_lrc.h
+++ b/drivers/gpu/drm/i915/intel_lrc.h
@@ -84,6 +84,7 @@ static inline void intel_logical_ring_emit_reg(struct 
intel_ringbuffer *ringbuf,
 #define LRC_STATE_PN   (LRC_PPHWSP_PN + 1)
 
 void intel_lr_context_free(struct intel_context *ctx);
+uint32_t intel_lr_context_size(struct intel_engine_cs *ring);
 int intel_lr_context_deferred_alloc(struct intel_context *ctx,
struct intel_engine_cs *ring);
 void intel_lr_context_unpin(struct drm_i915_gem_request *req);
-- 
2.5.0



Re: [Intel-gfx] [PATCH 3/5] drm/i915/guc: Add GuC ADS - scheduler policies

2015-12-17 Thread Yu Dai



On 12/16/2015 11:39 PM, Chris Wilson wrote:

On Wed, Dec 16, 2015 at 01:40:53PM -0800, yu@intel.com wrote:
> From: Alex Dai 
>
> GuC supports different scheduling policies for its four internal
> queues. Currently these have been set to the same default values
> as KMD_NORMAL queue.
>
> Particularly POLICY_MAX_NUM_WI is set to 15 to match GuC internal
> maximum submit queue numbers to avoid an out-of-space problem.
> This value indicates max number of work items allowed to be queued
> for one DPC process. A smaller value will let GuC schedule more
> frequently while a larger number may increase chances to optimize
> cmds (such as collapse cmds from same lrc) with risks that keeps
> CS idle.
>
> Signed-off-by: Alex Dai 
> ---
>  drivers/gpu/drm/i915/i915_guc_submission.c | 31 +++-
>  drivers/gpu/drm/i915/intel_guc_fwif.h  | 45 
++
>  2 files changed, 75 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c 
b/drivers/gpu/drm/i915/i915_guc_submission.c
> index 66d85c3..a5c555c 100644
> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
> @@ -842,17 +842,39 @@ static void guc_create_log(struct intel_guc *guc)
>guc->log_flags = (offset << GUC_LOG_BUF_ADDR_SHIFT) | flags;
>  }
>
> +static void init_guc_policies(struct guc_policies *policies)
> +{
> +  struct guc_policy *policy;
> +  u32 p, i;
> +
> +  policies->dpc_promote_time = 50;
> +  policies->max_num_work_items = POLICY_MAX_NUM_WI;
> +
> +  for (p = 0; p < GUC_CTX_PRIORITY_NUM; p++)
> +  for (i = 0; i < I915_NUM_RINGS; i++) {

Please indent this properly.

> +  policy = &policies->policy[p][i];
> +
> +  policy->execution_quantum = 100;
> +  policy->preemption_time = 50;
> +  policy->fault_time = 25;
> +  policy->policy_flags = 0;
> +  }
> +
> +  policies->is_valid = 1;
> +}
> +
>  static void guc_create_ads(struct intel_guc *guc)
>  {
>struct drm_i915_private *dev_priv = guc_to_i915(guc);
>struct drm_i915_gem_object *obj;
>struct guc_ads *ads;
> +  struct guc_policies *policies;
>struct intel_engine_cs *ring;
>struct page *page;
>u32 size, i;
>
>/* The ads obj includes the struct itself and buffers passed to GuC */
> -  size = sizeof(struct guc_ads);
> +  size = sizeof(struct guc_ads) + sizeof(struct guc_policies);
>
>obj = guc->ads_obj;
>if (!obj) {
> @@ -884,6 +906,13 @@ static void guc_create_ads(struct intel_guc *guc)
>for_each_ring(ring, dev_priv, i)
>ads->eng_state_size[i] = intel_lr_context_size(ring);
>
> +  /* GuC scheduling policies */
> +  policies = (void *)ads + sizeof(struct guc_ads);
> +  init_guc_policies(policies);

Please limit atomic context to only the critical section, i.e. don't
make me have to read every single function to check for violations.


Could you clarify this? I am not sure which atomic context and 
critical section you are referring to here.


Alex

> +
> +  ads->scheduler_policies = i915_gem_obj_ggtt_offset(obj) +
> +  sizeof(struct guc_ads);
> +
>kunmap_atomic(ads);





[Intel-gfx] [PATCH v1] drm/i915/guc: Add GuC ADS (Addition Data Structure) - allocation

2015-12-17 Thread yu . dai
From: Alex Dai 

The GuC firmware uses this for various purposes. The ADS itself is
a chunk of memory created by the driver to share with the GuC. Its
members are mostly addresses telling the GuC where to find things
such as the scheduler policies and the register list to be saved
and restored during reset.

This is the first patch of a series to enable the GuC ADS. For now,
we only create the ADS object while keeping it disabled.

v1: remove dead code checking return of kmap_atomic (Chris Wilson)

Signed-off-by: Alex Dai 

diff --git a/drivers/gpu/drm/i915/i915_guc_reg.h 
b/drivers/gpu/drm/i915/i915_guc_reg.h
index 90a84b4..8d27c09 100644
--- a/drivers/gpu/drm/i915/i915_guc_reg.h
+++ b/drivers/gpu/drm/i915/i915_guc_reg.h
@@ -40,6 +40,7 @@
 #define   GS_MIA_CORE_STATE  (1 << GS_MIA_SHIFT)
 
 #define SOFT_SCRATCH(n)_MMIO(0xc180 + (n) * 4)
+#define SOFT_SCRATCH_COUNT 16
 
 #define UOS_RSA_SCRATCH(i) _MMIO(0xc200 + (i) * 4)
 #define   UOS_RSA_SCRATCH_MAX_COUNT  64
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c 
b/drivers/gpu/drm/i915/i915_guc_submission.c
index 7554d16..28531e6 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -842,6 +842,46 @@ static void guc_create_log(struct intel_guc *guc)
guc->log_flags = (offset << GUC_LOG_BUF_ADDR_SHIFT) | flags;
 }
 
+static void guc_create_ads(struct intel_guc *guc)
+{
+   struct drm_i915_private *dev_priv = guc_to_i915(guc);
+   struct drm_i915_gem_object *obj;
+   struct guc_ads *ads;
+   struct intel_engine_cs *ring;
+   struct page *page;
+   u32 size, i;
+
+   /* The ads obj includes the struct itself and buffers passed to GuC */
+   size = sizeof(struct guc_ads);
+
+   obj = guc->ads_obj;
+   if (!obj) {
+   obj = gem_allocate_guc_obj(dev_priv->dev, PAGE_ALIGN(size));
+   if (!obj)
+   return;
+
+   guc->ads_obj = obj;
+   }
+
+   page = i915_gem_object_get_page(obj, 0);
+   ads = kmap_atomic(page);
+
+   /*
+* The GuC requires a "Golden Context" when it reinitialises
+* engines after a reset. Here we use the Render ring default
+* context, which must already exist and be pinned in the GGTT,
+* so its address won't change after we've told the GuC where
+* to find it.
+*/
+   ring = &dev_priv->ring[RCS];
+   ads->golden_context_lrca = ring->status_page.gfx_addr;
+
+   for_each_ring(ring, dev_priv, i)
+   ads->eng_state_size[i] = intel_lr_context_size(ring);
+
+   kunmap_atomic(ads);
+}
+
 /*
  * Set up the memory resources to be shared with the GuC.  At this point,
  * we require just one object that can be mapped through the GGTT.
@@ -868,6 +908,8 @@ int i915_guc_submission_init(struct drm_device *dev)
 
guc_create_log(guc);
 
+   guc_create_ads(guc);
+
return 0;
 }
 
@@ -906,6 +948,9 @@ void i915_guc_submission_fini(struct drm_device *dev)
struct drm_i915_private *dev_priv = dev->dev_private;
struct intel_guc *guc = &dev_priv->guc;
 
+   gem_release_guc_obj(dev_priv->guc.ads_obj);
+   guc->ads_obj = NULL;
+
gem_release_guc_obj(dev_priv->guc.log_obj);
guc->log_obj = NULL;
 
diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
index 5cf555d..5c9f894 100644
--- a/drivers/gpu/drm/i915/intel_guc.h
+++ b/drivers/gpu/drm/i915/intel_guc.h
@@ -89,6 +89,8 @@ struct intel_guc {
uint32_t log_flags;
struct drm_i915_gem_object *log_obj;
 
+   struct drm_i915_gem_object *ads_obj;
+
struct drm_i915_gem_object *ctx_pool_obj;
struct ida ctx_ids;
 
diff --git a/drivers/gpu/drm/i915/intel_guc_fwif.h 
b/drivers/gpu/drm/i915/intel_guc_fwif.h
index eaa50a4..76ecc85 100644
--- a/drivers/gpu/drm/i915/intel_guc_fwif.h
+++ b/drivers/gpu/drm/i915/intel_guc_fwif.h
@@ -81,11 +81,14 @@
 #define GUC_CTL_CTXINFO0
 #define   GUC_CTL_CTXNUM_IN16_SHIFT0
 #define   GUC_CTL_BASE_ADDR_SHIFT  12
+
 #define GUC_CTL_ARAT_HIGH  1
 #define GUC_CTL_ARAT_LOW   2
+
 #define GUC_CTL_DEVICE_INFO3
 #define   GUC_CTL_GTTYPE_SHIFT 0
 #define   GUC_CTL_COREFAMILY_SHIFT 7
+
 #define GUC_CTL_LOG_PARAMS 4
 #define   GUC_LOG_VALID(1 << 0)
 #define   GUC_LOG_NOTIFY_ON_HALF_FULL  (1 << 1)
@@ -97,9 +100,12 @@
 #define   GUC_LOG_ISR_PAGES3
 #define   GUC_LOG_ISR_SHIFT9
 #define   GUC_LOG_BUF_ADDR_SHIFT   12
+
 #define GUC_CTL_PAGE_FAULT_CONTROL 5
+
 #define GUC_CTL_WA 6
 #define   GUC_CTL_WA_UK_BY_DRIVER  (1 << 3)
+
 #define GUC_CTL_FEATURE7
 #define   GUC_CTL_VCS2_ENABLED (1 << 0)
 #define   GUC_CTL_KERNEL_SUBMISSIONS   (1 << 1)
@@ -109,6 +115,7 @@
 

[Intel-gfx] [PATCH v4] drm/i915/guc: Move GuC wq_check_space to alloc_request_extras

2015-12-16 Thread yu . dai
From: Alex Dai 

Split GuC work queue space checking from submission and move it to
ring_alloc_request_extras. The reason is that a failure in the later
i915_add_request() call won't be handled. In case a timeout happens,
the driver can return early in order to handle the error.

v1: Move wq_reserve_space to ring_reserve_space
v2: Move wq_reserve_space to alloc_request_extras (Chris Wilson)
v3: The work queue head pointer is cached by driver now. So we can
quickly return if space is available.
s/reserve/check/g (Dave Gordon)
v4: Update cached wq head after ring doorbell; check wq space before
ring doorbell in case unexpected error happens; call wq space
check only when GuC submission is enabled. (Dave Gordon)

Signed-off-by: Alex Dai 

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c 
b/drivers/gpu/drm/i915/i915_guc_submission.c
index ef20071..7554d16 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -247,6 +247,9 @@ static int guc_ring_doorbell(struct i915_guc_client *gc)
db_exc.cookie = 1;
}
 
+   /* Finally, update the cached copy of the GuC's WQ head */
+   gc->wq_head = desc->head;
+
kunmap_atomic(base);
return ret;
 }
@@ -472,28 +475,30 @@ static void guc_fini_ctx_desc(struct intel_guc *guc,
 sizeof(desc) * client->ctx_index);
 }
 
-/* Get valid workqueue item and return it back to offset */
-static int guc_get_workqueue_space(struct i915_guc_client *gc, u32 *offset)
+int i915_guc_wq_check_space(struct i915_guc_client *gc)
 {
struct guc_process_desc *desc;
void *base;
u32 size = sizeof(struct guc_wq_item);
int ret = -ETIMEDOUT, timeout_counter = 200;
 
+   if (!gc)
+   return 0;
+
+   /* Quickly return if wq space is available, since we cached the
+* head position last time. */
+   if (CIRC_SPACE(gc->wq_tail, gc->wq_head, gc->wq_size) >= size)
+   return 0;
+
base = kmap_atomic(i915_gem_object_get_page(gc->client_obj, 0));
desc = base + gc->proc_desc_offset;
 
while (timeout_counter-- > 0) {
-   if (CIRC_SPACE(gc->wq_tail, desc->head, gc->wq_size) >= size) {
-   *offset = gc->wq_tail;
+   gc->wq_head = desc->head;
 
-   /* advance the tail for next workqueue item */
-   gc->wq_tail += size;
-   gc->wq_tail &= gc->wq_size - 1;
-
-   /* this will break the loop */
-   timeout_counter = 0;
+   if (CIRC_SPACE(gc->wq_tail, gc->wq_head, gc->wq_size) >= size) {
ret = 0;
+   break;
}
 
if (timeout_counter)
@@ -511,12 +516,16 @@ static int guc_add_workqueue_item(struct i915_guc_client 
*gc,
enum intel_ring_id ring_id = rq->ring->id;
struct guc_wq_item *wqi;
void *base;
-   u32 tail, wq_len, wq_off = 0;
-   int ret;
+   u32 tail, wq_len, wq_off, space;
+
+   space = CIRC_SPACE(gc->wq_tail, gc->wq_head, gc->wq_size);
+   if (WARN_ON(space < sizeof(struct guc_wq_item)))
+   return -ENOSPC; /* shouldn't happen */
 
-   ret = guc_get_workqueue_space(gc, &wq_off);
-   if (ret)
-   return ret;
+   /* postincrement WQ tail for next time */
+   wq_off = gc->wq_tail;
+   gc->wq_tail += sizeof(struct guc_wq_item);
+   gc->wq_tail &= gc->wq_size - 1;
 
/* For now workqueue item is 4 DWs; workqueue buffer is 2 pages. So we
 * should not have the case where structure wqi is across page, neither
diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
index 0e048bf..5cf555d 100644
--- a/drivers/gpu/drm/i915/intel_guc.h
+++ b/drivers/gpu/drm/i915/intel_guc.h
@@ -43,6 +43,7 @@ struct i915_guc_client {
uint32_t wq_offset;
uint32_t wq_size;
uint32_t wq_tail;
+   uint32_t wq_head;
 
/* GuC submission statistics & status */
uint64_t submissions[I915_NUM_RINGS];
@@ -123,5 +124,6 @@ int i915_guc_submit(struct i915_guc_client *client,
struct drm_i915_gem_request *rq);
 void i915_guc_submission_disable(struct drm_device *dev);
 void i915_guc_submission_fini(struct drm_device *dev);
+int i915_guc_wq_check_space(struct i915_guc_client *client);
 
 #endif
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 272f36f..cd232d2 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -667,6 +667,19 @@ int intel_logical_ring_alloc_request_extras(struct 
drm_i915_gem_request *request
return ret;
}
 
+   if (i915.enable_guc_submission) {
+   /*
+* Check that the GuC has space for the request before
+* 

[Intel-gfx] [PATCH 3/5] drm/i915/guc: Add GuC ADS - scheduler policies

2015-12-16 Thread yu . dai
From: Alex Dai 

GuC supports different scheduling policies for its four internal
queues. Currently these have been set to the same default values
as the KMD_NORMAL queue.

In particular, POLICY_MAX_NUM_WI is set to 15 to match the GuC's
internal maximum submit queue depth, to avoid an out-of-space
problem. This value indicates the maximum number of work items
allowed to be queued for one DPC process. A smaller value will let
the GuC schedule more frequently, while a larger number may increase
the chance of optimising commands (such as collapsing commands from
the same lrc) at the risk of keeping the CS idle.

Signed-off-by: Alex Dai 
---
 drivers/gpu/drm/i915/i915_guc_submission.c | 31 +++-
 drivers/gpu/drm/i915/intel_guc_fwif.h  | 45 ++
 2 files changed, 75 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c 
b/drivers/gpu/drm/i915/i915_guc_submission.c
index 66d85c3..a5c555c 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -842,17 +842,39 @@ static void guc_create_log(struct intel_guc *guc)
guc->log_flags = (offset << GUC_LOG_BUF_ADDR_SHIFT) | flags;
 }
 
+static void init_guc_policies(struct guc_policies *policies)
+{
+   struct guc_policy *policy;
+   u32 p, i;
+
+   policies->dpc_promote_time = 50;
+   policies->max_num_work_items = POLICY_MAX_NUM_WI;
+
+   for (p = 0; p < GUC_CTX_PRIORITY_NUM; p++)
+   for (i = 0; i < I915_NUM_RINGS; i++) {
+   policy = &policies->policy[p][i];
+
+   policy->execution_quantum = 100;
+   policy->preemption_time = 50;
+   policy->fault_time = 25;
+   policy->policy_flags = 0;
+   }
+
+   policies->is_valid = 1;
+}
+
 static void guc_create_ads(struct intel_guc *guc)
 {
struct drm_i915_private *dev_priv = guc_to_i915(guc);
struct drm_i915_gem_object *obj;
struct guc_ads *ads;
+   struct guc_policies *policies;
struct intel_engine_cs *ring;
struct page *page;
u32 size, i;
 
/* The ads obj includes the struct itself and buffers passed to GuC */
-   size = sizeof(struct guc_ads);
+   size = sizeof(struct guc_ads) + sizeof(struct guc_policies);
 
obj = guc->ads_obj;
if (!obj) {
@@ -884,6 +906,13 @@ static void guc_create_ads(struct intel_guc *guc)
for_each_ring(ring, dev_priv, i)
ads->eng_state_size[i] = intel_lr_context_size(ring);
 
+   /* GuC scheduling policies */
+   policies = (void *)ads + sizeof(struct guc_ads);
+   init_guc_policies(policies);
+
+   ads->scheduler_policies = i915_gem_obj_ggtt_offset(obj) +
+   sizeof(struct guc_ads);
+
kunmap_atomic(ads);
 }
 
diff --git a/drivers/gpu/drm/i915/intel_guc_fwif.h 
b/drivers/gpu/drm/i915/intel_guc_fwif.h
index 76ecc85..0cc17c7 100644
--- a/drivers/gpu/drm/i915/intel_guc_fwif.h
+++ b/drivers/gpu/drm/i915/intel_guc_fwif.h
@@ -39,6 +39,7 @@
 #define GUC_CTX_PRIORITY_HIGH  1
 #define GUC_CTX_PRIORITY_KMD_NORMAL2
 #define GUC_CTX_PRIORITY_NORMAL3
+#define GUC_CTX_PRIORITY_NUM   4
 
 #define GUC_MAX_GPU_CONTEXTS   1024
 #defineGUC_INVALID_CTX_ID  GUC_MAX_GPU_CONTEXTS
@@ -316,6 +317,50 @@ struct guc_context_desc {
 #define GUC_POWER_D2   3
 #define GUC_POWER_D3   4
 
+/* Scheduling policy settings */
+
+/* Reset engine upon preempt failure */
+#define POLICY_RESET_ENGINE(1<<0)
+/* Preempt to idle on quantum expiry */
+#define POLICY_PREEMPT_TO_IDLE (1<<1)
+
+#define POLICY_MAX_NUM_WI  15
+
+struct guc_policy {
+   /* Time for one workload to execute. (in micro seconds) */
+   u32 execution_quantum;
+   u32 reserved1;
+
+   /* Time to wait for a preemption request to completed before issuing a
+* reset. (in micro seconds). */
+   u32 preemption_time;
+
+   /* How much time to allow to run after the first fault is observed.
+* Then preempt afterwards. (in micro seconds) */
+   u32 fault_time;
+
+   u32 policy_flags;
+   u32 reserved[2];
+} __packed;
+
+struct guc_policies {
+   struct guc_policy policy[GUC_CTX_PRIORITY_NUM][I915_NUM_RINGS];
+
+   /* In micro seconds. How much time to allow before DPC processing is
+* called back via interrupt (to prevent DPC queue drain starving).
+* Typically 1000s of micro seconds (example only, not granularity). */
+   u32 dpc_promote_time;
+
+   /* Must be set to take these new values. */
+   u32 is_valid;
+
+   /* Max number of WIs to process per call. A large value may keep CS
+* idle. */
+   u32 max_num_work_items;
+
+   u32 reserved[19];
+} __packed;
+
 /* GuC Additional Data Struct */
 
 struct guc_ads {
-- 
2.5.0


[Intel-gfx] [PATCH 0/5] Add GuC ADS (Addition Data Structure)

2015-12-16 Thread yu . dai
From: Alex Dai 

The GuC firmware uses this for various purposes. The ADS itself is a chunk of
memory created by the driver to share with the GuC. This series creates the
GuC ADS object and sets up some basic settings for it.

Alex Dai (4):
  drm/i915/guc: Add GuC ADS (Addition Data Structure) - allocation
  drm/i915/guc: Add GuC ADS - scheduler policies
  drm/i915/guc: Add GuC ADS - MMIO reg state
  drm/i915/guc: Add GuC ADS - enabling ADS

Dave Gordon (1):
  drm/i915/guc: Expose (intel)_lr_context_size()

 drivers/gpu/drm/i915/i915_guc_reg.h|   1 +
 drivers/gpu/drm/i915/i915_guc_submission.c |  99 +
 drivers/gpu/drm/i915/intel_guc.h   |   2 +
 drivers/gpu/drm/i915/intel_guc_fwif.h  | 113 -
 drivers/gpu/drm/i915/intel_guc_loader.c|   7 ++
 drivers/gpu/drm/i915/intel_lrc.c   |   4 +-
 drivers/gpu/drm/i915/intel_lrc.h   |   1 +
 7 files changed, 224 insertions(+), 3 deletions(-)

-- 
2.5.0



[Intel-gfx] [PATCH 4/5] drm/i915/guc: Add GuC ADS - MMIO reg state

2015-12-16 Thread yu . dai
From: Alex Dai 

The GuC needs to know which registers, and how they will be saved and
restored, during events such as engine reset or power state changes.
For now only the base address of the reg state is initialized. The
detailed register table will probably be set up in a future GuC TDR
or preemption patch series.

Signed-off-by: Alex Dai 
---
 drivers/gpu/drm/i915/i915_guc_submission.c | 22 +-
 drivers/gpu/drm/i915/intel_guc_fwif.h  | 37 ++
 2 files changed, 58 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c 
b/drivers/gpu/drm/i915/i915_guc_submission.c
index a5c555c..92b8a34 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -869,12 +869,15 @@ static void guc_create_ads(struct intel_guc *guc)
struct drm_i915_gem_object *obj;
struct guc_ads *ads;
struct guc_policies *policies;
+   struct guc_mmio_reg_state *reg_state;
struct intel_engine_cs *ring;
struct page *page;
u32 size, i;
 
/* The ads obj includes the struct itself and buffers passed to GuC */
-   size = sizeof(struct guc_ads) + sizeof(struct guc_policies);
+   size = sizeof(struct guc_ads) + sizeof(struct guc_policies) +
+   sizeof(struct guc_mmio_reg_state) +
+   GUC_S3_SAVE_SPACE_PAGES * PAGE_SIZE;
 
obj = guc->ads_obj;
if (!obj) {
@@ -913,6 +916,23 @@ static void guc_create_ads(struct intel_guc *guc)
ads->scheduler_policies = i915_gem_obj_ggtt_offset(obj) +
sizeof(struct guc_ads);
 
+   /* MMIO reg state */
+   reg_state = (void *)policies + sizeof(struct guc_policies);
+
+   for (i = 0; i < I915_NUM_RINGS; i++) {
+   reg_state->mmio_white_list[i].mmio_start =
+   dev_priv->ring[i].mmio_base + GUC_MMIO_WHITE_LIST_START;
+
+   /* Nothing to be saved or restored for now. */
+   reg_state->mmio_white_list[i].count = 0;
+   }
+
+   ads->reg_state_addr = ads->scheduler_policies +
+   sizeof(struct guc_policies);
+
+   ads->reg_state_buffer = ads->reg_state_addr +
+   sizeof(struct guc_mmio_reg_state);
+
kunmap_atomic(ads);
 }
 
diff --git a/drivers/gpu/drm/i915/intel_guc_fwif.h 
b/drivers/gpu/drm/i915/intel_guc_fwif.h
index 0cc17c7..1bb6410 100644
--- a/drivers/gpu/drm/i915/intel_guc_fwif.h
+++ b/drivers/gpu/drm/i915/intel_guc_fwif.h
@@ -361,6 +361,43 @@ struct guc_policies {
u32 reserved[19];
 } __packed;
 
+/* GuC MMIO reg state struct */
+
+#define GUC_REGSET_FLAGS_NONE  0x0
+#define GUC_REGSET_POWERCYCLE  0x1
+#define GUC_REGSET_MASKED  0x2
+#define GUC_REGSET_ENGINERESET 0x4
+#define GUC_REGSET_SAVE_DEFAULT_VALUE  0x8
+#define GUC_REGSET_SAVE_CURRENT_VALUE  0x10
+
+#define GUC_REGSET_MAX_REGISTERS   20
+#define GUC_MMIO_WHITE_LIST_START  0x24d0
+#define GUC_MMIO_WHITE_LIST_MAX12
+#define GUC_S3_SAVE_SPACE_PAGES10
+
+struct guc_mmio_regset {
+   struct __packed {
+   u32 offset;
+   u32 value;
+   u32 flags;
+   } registers[GUC_REGSET_MAX_REGISTERS];
+
+   u32 values_valid;
+   u32 number_of_registers;
+} __packed;
+
+struct guc_mmio_reg_state {
+   struct guc_mmio_regset global_reg;
+   struct guc_mmio_regset engine_reg[I915_NUM_RINGS];
+
+   /* MMIO registers that are set as non privileged */
+   struct __packed {
+   u32 mmio_start;
+   u32 offsets[GUC_MMIO_WHITE_LIST_MAX];
+   u32 count;
+   } mmio_white_list[I915_NUM_RINGS];
+} __packed;
+
 /* GuC Additional Data Struct */
 
 struct guc_ads {
-- 
2.5.0



[Intel-gfx] [PATCH 2/5] drm/i915/guc: Add GuC ADS (Addition Data Structure) - allocation

2015-12-16 Thread yu . dai
From: Alex Dai 

The GuC firmware uses this for various purposes. The ADS itself is
a chunk of memory created by the driver to share with the GuC. Its
members are mostly addresses telling the GuC where to find things
such as the scheduler policies and the register list that will be
saved and restored during reset, etc.

This is the first patch of a series to enable the GuC ADS. For now,
we only create the ADS object whilst keeping it disabled.

Signed-off-by: Alex Dai 
---
 drivers/gpu/drm/i915/i915_guc_reg.h|  1 +
 drivers/gpu/drm/i915/i915_guc_submission.c | 50 ++
 drivers/gpu/drm/i915/intel_guc.h   |  2 ++
 drivers/gpu/drm/i915/intel_guc_fwif.h  | 31 +-
 4 files changed, 83 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_reg.h 
b/drivers/gpu/drm/i915/i915_guc_reg.h
index 90a84b4..8d27c09 100644
--- a/drivers/gpu/drm/i915/i915_guc_reg.h
+++ b/drivers/gpu/drm/i915/i915_guc_reg.h
@@ -40,6 +40,7 @@
 #define   GS_MIA_CORE_STATE  (1 << GS_MIA_SHIFT)
 
 #define SOFT_SCRATCH(n)_MMIO(0xc180 + (n) * 4)
+#define SOFT_SCRATCH_COUNT 16
 
 #define UOS_RSA_SCRATCH(i) _MMIO(0xc200 + (i) * 4)
 #define   UOS_RSA_SCRATCH_MAX_COUNT  64
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c 
b/drivers/gpu/drm/i915/i915_guc_submission.c
index 7554d16..66d85c3 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -842,6 +842,51 @@ static void guc_create_log(struct intel_guc *guc)
guc->log_flags = (offset << GUC_LOG_BUF_ADDR_SHIFT) | flags;
 }
 
+static void guc_create_ads(struct intel_guc *guc)
+{
+   struct drm_i915_private *dev_priv = guc_to_i915(guc);
+   struct drm_i915_gem_object *obj;
+   struct guc_ads *ads;
+   struct intel_engine_cs *ring;
+   struct page *page;
+   u32 size, i;
+
+   /* The ads obj includes the struct itself and buffers passed to GuC */
+   size = sizeof(struct guc_ads);
+
+   obj = guc->ads_obj;
+   if (!obj) {
+   obj = gem_allocate_guc_obj(dev_priv->dev, PAGE_ALIGN(size));
+   if (!obj)
+   return;
+
+   guc->ads_obj = obj;
+   }
+
+   page = i915_gem_object_get_page(obj, 0);
+   ads = kmap_atomic(page);
+   if (!ads) {
+   guc->ads_obj = NULL;
+   gem_release_guc_obj(obj);
+   return;
+   }
+
+   /*
+* The GuC requires a "Golden Context" when it reinitialises
+* engines after a reset. Here we use the Render ring default
+* context, which must already exist and be pinned in the GGTT,
+* so its address won't change after we've told the GuC where
+* to find it.
+*/
+   ring = &dev_priv->ring[RCS];
+   ads->golden_context_lrca = ring->status_page.gfx_addr;
+
+   for_each_ring(ring, dev_priv, i)
+   ads->eng_state_size[i] = intel_lr_context_size(ring);
+
+   kunmap_atomic(ads);
+}
+
 /*
  * Set up the memory resources to be shared with the GuC.  At this point,
  * we require just one object that can be mapped through the GGTT.
@@ -868,6 +913,8 @@ int i915_guc_submission_init(struct drm_device *dev)
 
guc_create_log(guc);
 
+   guc_create_ads(guc);
+
return 0;
 }
 
@@ -906,6 +953,9 @@ void i915_guc_submission_fini(struct drm_device *dev)
struct drm_i915_private *dev_priv = dev->dev_private;
struct intel_guc *guc = &dev_priv->guc;
 
+   gem_release_guc_obj(dev_priv->guc.ads_obj);
+   guc->ads_obj = NULL;
+
gem_release_guc_obj(dev_priv->guc.log_obj);
guc->log_obj = NULL;
 
diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
index 5cf555d..5c9f894 100644
--- a/drivers/gpu/drm/i915/intel_guc.h
+++ b/drivers/gpu/drm/i915/intel_guc.h
@@ -89,6 +89,8 @@ struct intel_guc {
uint32_t log_flags;
struct drm_i915_gem_object *log_obj;
 
+   struct drm_i915_gem_object *ads_obj;
+
struct drm_i915_gem_object *ctx_pool_obj;
struct ida ctx_ids;
 
diff --git a/drivers/gpu/drm/i915/intel_guc_fwif.h 
b/drivers/gpu/drm/i915/intel_guc_fwif.h
index eaa50a4..76ecc85 100644
--- a/drivers/gpu/drm/i915/intel_guc_fwif.h
+++ b/drivers/gpu/drm/i915/intel_guc_fwif.h
@@ -81,11 +81,14 @@
 #define GUC_CTL_CTXINFO0
 #define   GUC_CTL_CTXNUM_IN16_SHIFT0
 #define   GUC_CTL_BASE_ADDR_SHIFT  12
+
 #define GUC_CTL_ARAT_HIGH  1
 #define GUC_CTL_ARAT_LOW   2
+
 #define GUC_CTL_DEVICE_INFO3
 #define   GUC_CTL_GTTYPE_SHIFT 0
 #define   GUC_CTL_COREFAMILY_SHIFT 7
+
 #define GUC_CTL_LOG_PARAMS 4
 #define   GUC_LOG_VALID(1 << 0)
 #define   GUC_LOG_NOTIFY_ON_HALF_FULL  (1 << 1)
@@ -97,9 +100,12 @@
 #define   GUC_LOG_ISR_PAGES3
 #define   

[Intel-gfx] [PATCH 1/5] drm/i915/guc: Expose (intel)_lr_context_size()

2015-12-16 Thread yu . dai
From: Dave Gordon 

The GuC code needs to know the size of a logical context, so we
expose get_lr_context_size(), renaming it intel_lr_context_size()
to fit the naming conventions for non-static functions.

For: VIZ-2021
Signed-off-by: Dave Gordon 
Signed-off-by: Alex Dai 
---
 drivers/gpu/drm/i915/intel_lrc.c | 4 ++--
 drivers/gpu/drm/i915/intel_lrc.h | 1 +
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index cd232d2..bbdcd5d 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -2482,7 +2482,7 @@ void intel_lr_context_free(struct intel_context *ctx)
}
 }
 
-static uint32_t get_lr_context_size(struct intel_engine_cs *ring)
+uint32_t intel_lr_context_size(struct intel_engine_cs *ring)
 {
int ret = 0;
 
@@ -2550,7 +2550,7 @@ int intel_lr_context_deferred_alloc(struct intel_context 
*ctx,
WARN_ON(ctx->legacy_hw_ctx.rcs_state != NULL);
WARN_ON(ctx->engine[ring->id].state);
 
-   context_size = round_up(get_lr_context_size(ring), 4096);
+   context_size = round_up(intel_lr_context_size(ring), 4096);
 
/* One extra page as the sharing data between driver and GuC */
context_size += PAGE_SIZE * LRC_PPHWSP_PN;
diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/intel_lrc.h
index fd1b6b4..9b2b9bd 100644
--- a/drivers/gpu/drm/i915/intel_lrc.h
+++ b/drivers/gpu/drm/i915/intel_lrc.h
@@ -84,6 +84,7 @@ static inline void intel_logical_ring_emit_reg(struct 
intel_ringbuffer *ringbuf,
 #define LRC_STATE_PN   (LRC_PPHWSP_PN + 1)
 
 void intel_lr_context_free(struct intel_context *ctx);
+uint32_t intel_lr_context_size(struct intel_engine_cs *ring);
 int intel_lr_context_deferred_alloc(struct intel_context *ctx,
struct intel_engine_cs *ring);
 void intel_lr_context_unpin(struct drm_i915_gem_request *req);
-- 
2.5.0



[Intel-gfx] [PATCH 5/5] drm/i915/guc: Add GuC ADS - enabling ADS

2015-12-16 Thread yu . dai
From: Alex Dai 

Set ADS enabling flag during GuC init.

Signed-off-by: Alex Dai 
---
 drivers/gpu/drm/i915/intel_guc_loader.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/drivers/gpu/drm/i915/intel_guc_loader.c 
b/drivers/gpu/drm/i915/intel_guc_loader.c
index 4740949..625272f4 100644
--- a/drivers/gpu/drm/i915/intel_guc_loader.c
+++ b/drivers/gpu/drm/i915/intel_guc_loader.c
@@ -165,6 +165,13 @@ static void set_guc_init_params(struct drm_i915_private 
*dev_priv)
i915.guc_log_level << GUC_LOG_VERBOSITY_SHIFT;
}
 
+   if (guc->ads_obj) {
+   u32 ads = (u32)i915_gem_obj_ggtt_offset(guc->ads_obj)
+   >> PAGE_SHIFT;
+   params[GUC_CTL_DEBUG] |= ads << GUC_ADS_ADDR_SHIFT;
+   params[GUC_CTL_DEBUG] |= GUC_ADS_ENABLED;
+   }
+
/* If GuC submission is enabled, set up additional parameters here */
if (i915.enable_guc_submission) {
u32 pgs = i915_gem_obj_ggtt_offset(dev_priv->guc.ctx_pool_obj);
-- 
2.5.0



[Intel-gfx] [PATCH] drm/i915/guc: Fix a warning message problem during driver unload

2015-12-16 Thread yu . dai
From: Alex Dai 

The device struct_mutex needs to be held before releasing any GEM
objects allocated by GuC.

Signed-off-by: Alex Dai 
---
 drivers/gpu/drm/i915/intel_guc_loader.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/intel_guc_loader.c 
b/drivers/gpu/drm/i915/intel_guc_loader.c
index 625272f4..4748651 100644
--- a/drivers/gpu/drm/i915/intel_guc_loader.c
+++ b/drivers/gpu/drm/i915/intel_guc_loader.c
@@ -631,10 +631,10 @@ void intel_guc_ucode_fini(struct drm_device *dev)
struct drm_i915_private *dev_priv = dev->dev_private;
struct intel_guc_fw *guc_fw = &dev_priv->guc.guc_fw;
 
+   mutex_lock(&dev->struct_mutex);
direct_interrupts_to_host(dev_priv);
i915_guc_submission_fini(dev);
 
-   mutex_lock(&dev->struct_mutex);
if (guc_fw->guc_fw_obj)
drm_gem_object_unreference(&guc_fw->guc_fw_obj->base);
guc_fw->guc_fw_obj = NULL;
-- 
2.5.0



[Intel-gfx] [PATCH v3] drm/i915/guc: Move GuC wq_check_space to alloc_request_extras

2015-12-10 Thread yu . dai
From: Alex Dai 

Split GuC work queue space checking from submission and move it to
ring_alloc_request_extras. The reason is that a failure in the later
i915_add_request() call won't be handled. In case a timeout happens,
the driver can return early in order to handle the error.

v1: Move wq_reserve_space to ring_reserve_space
v2: Move wq_reserve_space to alloc_request_extras (Chris Wilson)
v3: The work queue head pointer is cached by driver now. So we can
quickly return if space is available.
s/reserve/check/g (Dave Gordon)

Signed-off-by: Alex Dai 
---
 drivers/gpu/drm/i915/i915_guc_submission.c | 34 ++
 drivers/gpu/drm/i915/intel_guc.h   |  2 ++
 drivers/gpu/drm/i915/intel_lrc.c   |  6 ++
 3 files changed, 28 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c 
b/drivers/gpu/drm/i915/i915_guc_submission.c
index 226e9c0..cb8e1f71 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -207,6 +207,9 @@ static int guc_ring_doorbell(struct i915_guc_client *gc)
/* Update the tail so it is visible to GuC */
desc->tail = gc->wq_tail;
 
+   /* Cache the head where GuC is processing */
+   gc->wq_head = desc->head;
+
/* current cookie */
db_cmp.db_status = GUC_DOORBELL_ENABLED;
db_cmp.cookie = gc->cookie;
@@ -472,28 +475,30 @@ static void guc_fini_ctx_desc(struct intel_guc *guc,
 sizeof(desc) * client->ctx_index);
 }
 
-/* Get valid workqueue item and return it back to offset */
-static int guc_get_workqueue_space(struct i915_guc_client *gc, u32 *offset)
+int i915_guc_wq_check_space(struct i915_guc_client *gc)
 {
struct guc_process_desc *desc;
void *base;
u32 size = sizeof(struct guc_wq_item);
int ret = -ETIMEDOUT, timeout_counter = 200;
 
+   if (!gc)
+   return 0;
+
+   /* Quickly return if wq space is available, since we cached the
+* head position last time. */
+   if (CIRC_SPACE(gc->wq_tail, gc->wq_head, gc->wq_size) >= size)
+   return 0;
+
base = kmap_atomic(i915_gem_object_get_page(gc->client_obj, 0));
desc = base + gc->proc_desc_offset;
 
while (timeout_counter-- > 0) {
-   if (CIRC_SPACE(gc->wq_tail, desc->head, gc->wq_size) >= size) {
-   *offset = gc->wq_tail;
+   gc->wq_head = desc->head;
 
-   /* advance the tail for next workqueue item */
-   gc->wq_tail += size;
-   gc->wq_tail &= gc->wq_size - 1;
-
-   /* this will break the loop */
-   timeout_counter = 0;
+   if (CIRC_SPACE(gc->wq_tail, gc->wq_head, gc->wq_size) >= size) {
ret = 0;
+   break;
}
 
if (timeout_counter)
@@ -512,11 +517,8 @@ static int guc_add_workqueue_item(struct i915_guc_client 
*gc,
struct guc_wq_item *wqi;
void *base;
u32 tail, wq_len, wq_off = 0;
-   int ret;
 
-   ret = guc_get_workqueue_space(gc, &wq_off);
-   if (ret)
-   return ret;
+   wq_off = gc->wq_tail;
 
/* For now workqueue item is 4 DWs; workqueue buffer is 2 pages. So we
 * should not have the case where structure wqi is across page, neither
@@ -551,6 +553,10 @@ static int guc_add_workqueue_item(struct i915_guc_client 
*gc,
 
kunmap_atomic(base);
 
+   /* advance the tail for next workqueue item */
+   gc->wq_tail += sizeof(struct guc_wq_item);
+   gc->wq_tail &= gc->wq_size - 1;
+
return 0;
 }
 
diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
index 0e048bf..5cf555d 100644
--- a/drivers/gpu/drm/i915/intel_guc.h
+++ b/drivers/gpu/drm/i915/intel_guc.h
@@ -43,6 +43,7 @@ struct i915_guc_client {
uint32_t wq_offset;
uint32_t wq_size;
uint32_t wq_tail;
+   uint32_t wq_head;
 
/* GuC submission statistics & status */
uint64_t submissions[I915_NUM_RINGS];
@@ -123,5 +124,6 @@ int i915_guc_submit(struct i915_guc_client *client,
struct drm_i915_gem_request *rq);
 void i915_guc_submission_disable(struct drm_device *dev);
 void i915_guc_submission_fini(struct drm_device *dev);
+int i915_guc_wq_check_space(struct i915_guc_client *client);
 
 #endif
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index f96fb51..9605ce9 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -667,6 +667,12 @@ int intel_logical_ring_alloc_request_extras(struct 
drm_i915_gem_request *request
return ret;
}
 
+   /* Reserve GuC WQ space here (one request needs one WQ item) because
+* the later i915_add_request() call can't fail. */
+   

Re: [Intel-gfx] [PATCH v2] drm/i915/guc: Move GuC wq_reserve_space to alloc_request_extras

2015-12-10 Thread Yu Dai



On 12/10/2015 09:14 AM, Dave Gordon wrote:

On 09/12/15 18:50, yu@intel.com wrote:
> From: Alex Dai 
>
> Split GuC work queue space reserve from submission and move it to
> ring_alloc_request_extras. The reason is that failure in later
> i915_add_request() won't be handled. In the case timeout happens,
> driver can return early in order to handle the error.
>
> v1: Move wq_reserve_space to ring_reserve_space
> v2: Move wq_reserve_space to alloc_request_extras (Chris Wilson)
>
> Signed-off-by: Alex Dai 
> ---
>   drivers/gpu/drm/i915/i915_guc_submission.c | 21 +
>   drivers/gpu/drm/i915/intel_guc.h   |  1 +
>   drivers/gpu/drm/i915/intel_lrc.c   |  6 ++
>   3 files changed, 16 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c 
b/drivers/gpu/drm/i915/i915_guc_submission.c
> index 226e9c0..f7bd038 100644
> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
> @@ -472,25 +472,21 @@ static void guc_fini_ctx_desc(struct intel_guc *guc,
> sizeof(desc) * client->ctx_index);
>   }
>
> -/* Get valid workqueue item and return it back to offset */
> -static int guc_get_workqueue_space(struct i915_guc_client *gc, u32 *offset)
> +int i915_guc_wq_reserve_space(struct i915_guc_client *gc)

I think the name is misleading, because we don't actually reserve
anything here, just check that there is some free space in the WQ.

(We certainly don't WANT to reserve anything, because that would be
difficult to clean up in the event of submission failure for any other
reason.) So I think it's only the name that needs changing. Although ...


I was trying to use a similar name to ring_reserve_space. It follows the
same pattern as filling the ring buffer: reserve, emit and advance. The
only difference is that it reserves 4 dwords (or checks for space)
rather than a variable number of bytes as in the ring buffer case. Maybe
using 'check' is better here, because 4 dwords in the wq is all we need
for a submission via the GuC.

>   {
>struct guc_process_desc *desc;
>void *base;
>u32 size = sizeof(struct guc_wq_item);
>int ret = -ETIMEDOUT, timeout_counter = 200;
>
> +  if (!gc)
> +  return 0;
> +
>base = kmap_atomic(i915_gem_object_get_page(gc->client_obj, 0));
>desc = base + gc->proc_desc_offset;
>
>while (timeout_counter-- > 0) {
>if (CIRC_SPACE(gc->wq_tail, desc->head, gc->wq_size) >= size) {

... as an alternative strategy, we could cache the calculated freespace
in the client structure; then if we already know there's at least 1 slot
free from last time we checked, we could then just decrement the cached
value and avoid the kmap+spinwait overhead. Only when we reach 0 would
we have to go through this code to refresh our view of desc->head and
recalculate the actual current freespace. [NB: clear cached value on reset?]

Does that sound like a useful optimisation?


I think it is a good idea.

> -  *offset = gc->wq_tail;
> -
> -  /* advance the tail for next workqueue item */
> -  gc->wq_tail += size;
> -  gc->wq_tail &= gc->wq_size - 1;
> -
>/* this will break the loop */
>timeout_counter = 0;
>ret = 0;
> @@ -512,11 +508,12 @@ static int guc_add_workqueue_item(struct 
i915_guc_client *gc,
>struct guc_wq_item *wqi;
>void *base;
>u32 tail, wq_len, wq_off = 0;
> -  int ret;
>
> -  ret = guc_get_workqueue_space(gc, &wq_off);
> -  if (ret)
> -  return ret;
> +  wq_off = gc->wq_tail;
> +
> +  /* advance the tail for next workqueue item */
> +  gc->wq_tail += sizeof(struct guc_wq_item);
> +  gc->wq_tail &= gc->wq_size - 1;

I was a bit unhappy about this code just assuming that there *must* be
space (because we KNOW we've checked above) -- unless someone violated
the proper calling sequence (TDR?). OTOH, it would be too expensive to
go through the map-and-calculate code all over again just to catch an
unlikely scenario. But, if we cache the last-calculated value as above,
then the check could be cheap :) For example, just add a pre_checked
size field that's set by the pre-check and then checked and decremented
on submission; there shouldn't be more than one submission in progress
at a time, because dev->struct_mutex is held across the whole sequence
(but it's not an error to see two pre-checks in a row, because a request
can be abandoned partway).


I don't understand the concerns here. As I said above, filling the GuC
WQ is the same as filling the ring buffer: reserve (check), emit and
advance. These two lines are the GuC version of intel_logical_ring_emit
and intel_logical_ring_advance. Maybe if I move these two lines to the
end of guc_add_workqueue_item, people will understand it more easily.



>/* For now workqueue item is 4 DWs; workqueue buffer is 2 pages. So we
> * should not have the case 

Re: [Intel-gfx] [PATCH v8] drm/i915: Extend LRC pinning to cover GPU context writeback

2015-12-09 Thread Yu Dai

Reviewed-by: Alex Dai 

On 12/07/2015 09:10 AM, Nick Hoath wrote:

Use the first retired request on a new context to unpin
the old context. This ensures that the hw context remains
bound until it has been written back to by the GPU.
Now that the context is pinned until later in the request/context
lifecycle, it no longer needs to be pinned from context_queue to
retire_requests.
This fixes an issue with GuC submission where the GPU might not
have finished writing back the context before it is unpinned. This
results in a GPU hang.

v2: Moved the new pin to cover GuC submission (Alex Dai)
 Moved the new unpin to request_retire to fix coverage leak
v3: Added switch to default context if freeing a still pinned
 context just in case the hw was actually still using it
v4: Unwrapped context unpin to allow calling without a request
v5: Only create a switch to idle context if the ring doesn't
 already have a request pending on it (Alex Dai)
 Rename unsaved to dirty to avoid double negatives (Dave Gordon)
 Changed _no_req postfix to __ prefix for consistency (Dave Gordon)
 Split out per engine cleanup from context_free as it
 was getting unwieldy
 Corrected locking (Dave Gordon)
v6: Removed some bikeshedding (Mika Kuoppala)
 Added explanation of the GuC hang that this fixes (Daniel Vetter)
v7: Removed extra per request pinning from ring reset code (Alex Dai)
 Added forced ring unpin/clean in error case in context free (Alex Dai)
v8: Renamed lrc specific last_context to lrc_last_context as there
 were some reset cases where the codepaths leaked (Mika Kuoppala)
 NULL'd last_context in reset case - there was a pointer leak
 if someone did reset->close context.
Signed-off-by: Nick Hoath 
Issue: VIZ-4277
Cc: Daniel Vetter 
Cc: David Gordon 
Cc: Chris Wilson 
Cc: Alex Dai 
Cc: Mika Kuoppala 
---
  drivers/gpu/drm/i915/i915_drv.h |   1 +
  drivers/gpu/drm/i915/i915_gem.c |   7 +-
  drivers/gpu/drm/i915/intel_lrc.c| 138 ++--
  drivers/gpu/drm/i915/intel_lrc.h|   1 +
  drivers/gpu/drm/i915/intel_ringbuffer.h |   1 +
  5 files changed, 121 insertions(+), 27 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 9ab3e25..a59ca13 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -884,6 +884,7 @@ struct intel_context {
struct {
struct drm_i915_gem_object *state;
struct intel_ringbuffer *ringbuf;
+   bool dirty;
int pin_count;
} engine[I915_NUM_RINGS];
  
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c

index a6997a8..cd27ecc 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1362,6 +1362,9 @@ static void i915_gem_request_retire(struct 
drm_i915_gem_request *request)
  {
trace_i915_gem_request_retire(request);
  
+	if (i915.enable_execlists)
+		intel_lr_context_complete_check(request);
+
/* We know the GPU must have read the request to have
 * sent us the seqno + interrupt, so use the position
 * of tail of the request to update the last known position
@@ -2772,10 +2775,6 @@ static void i915_gem_reset_ring_cleanup(struct 
drm_i915_private *dev_priv,
struct drm_i915_gem_request,
execlist_link);
list_del(&submit_req->execlist_link);
-
-   if (submit_req->ctx != ring->default_context)
-   intel_lr_context_unpin(submit_req);
-
i915_gem_request_unreference(submit_req);
}
spin_unlock_irq(&ring->execlist_lock);
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 4ebafab..f96fb51 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -571,9 +571,6 @@ static int execlists_context_queue(struct 
drm_i915_gem_request *request)
struct drm_i915_gem_request *cursor;
int num_elements = 0;
  
-	if (request->ctx != ring->default_context)
-		intel_lr_context_pin(request);
-
i915_gem_request_reference(request);
  
spin_lock_irq(&ring->execlist_lock);

@@ -737,6 +734,13 @@ intel_logical_ring_advance_and_submit(struct 
drm_i915_gem_request *request)
if (intel_ring_stopped(ring))
return;
  
+	if (request->ctx != ring->default_context) {
+   if (!request->ctx->engine[ring->id].dirty) {
+   intel_lr_context_pin(request);
+   request->ctx->engine[ring->id].dirty = true;
+   }
+   }
+
if (dev_priv->guc.execbuf_client)
i915_guc_submit(dev_priv->guc.execbuf_client, request);

[Intel-gfx] [PATCH v2] drm/i915/guc: Move GuC wq_reserve_space to alloc_request_extras

2015-12-09 Thread yu . dai
From: Alex Dai 

Split GuC work queue space reserve from submission and move it to
ring_alloc_request_extras. The reason is that failure in later
i915_add_request() won't be handled. In case a timeout happens, the
driver can return early in order to handle the error.

v1: Move wq_reserve_space to ring_reserve_space
v2: Move wq_reserve_space to alloc_request_extras (Chris Wilson)

Signed-off-by: Alex Dai 
---
 drivers/gpu/drm/i915/i915_guc_submission.c | 21 +
 drivers/gpu/drm/i915/intel_guc.h   |  1 +
 drivers/gpu/drm/i915/intel_lrc.c   |  6 ++
 3 files changed, 16 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c 
b/drivers/gpu/drm/i915/i915_guc_submission.c
index 226e9c0..f7bd038 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -472,25 +472,21 @@ static void guc_fini_ctx_desc(struct intel_guc *guc,
 sizeof(desc) * client->ctx_index);
 }
 
-/* Get valid workqueue item and return it back to offset */
-static int guc_get_workqueue_space(struct i915_guc_client *gc, u32 *offset)
+int i915_guc_wq_reserve_space(struct i915_guc_client *gc)
 {
struct guc_process_desc *desc;
void *base;
u32 size = sizeof(struct guc_wq_item);
int ret = -ETIMEDOUT, timeout_counter = 200;
 
+   if (!gc)
+   return 0;
+
base = kmap_atomic(i915_gem_object_get_page(gc->client_obj, 0));
desc = base + gc->proc_desc_offset;
 
while (timeout_counter-- > 0) {
if (CIRC_SPACE(gc->wq_tail, desc->head, gc->wq_size) >= size) {
-   *offset = gc->wq_tail;
-
-   /* advance the tail for next workqueue item */
-   gc->wq_tail += size;
-   gc->wq_tail &= gc->wq_size - 1;
-
/* this will break the loop */
timeout_counter = 0;
ret = 0;
@@ -512,11 +508,12 @@ static int guc_add_workqueue_item(struct i915_guc_client 
*gc,
struct guc_wq_item *wqi;
void *base;
u32 tail, wq_len, wq_off = 0;
-   int ret;
 
-   ret = guc_get_workqueue_space(gc, &wq_off);
-   if (ret)
-   return ret;
+   wq_off = gc->wq_tail;
+
+   /* advance the tail for next workqueue item */
+   gc->wq_tail += sizeof(struct guc_wq_item);
+   gc->wq_tail &= gc->wq_size - 1;
 
/* For now workqueue item is 4 DWs; workqueue buffer is 2 pages. So we
 * should not have the case where structure wqi is across page, neither
diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
index 0e048bf..59c8e21 100644
--- a/drivers/gpu/drm/i915/intel_guc.h
+++ b/drivers/gpu/drm/i915/intel_guc.h
@@ -123,5 +123,6 @@ int i915_guc_submit(struct i915_guc_client *client,
struct drm_i915_gem_request *rq);
 void i915_guc_submission_disable(struct drm_device *dev);
 void i915_guc_submission_fini(struct drm_device *dev);
+int i915_guc_wq_reserve_space(struct i915_guc_client *client);
 
 #endif
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index f96fb51..7d53d27 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -667,6 +667,12 @@ int intel_logical_ring_alloc_request_extras(struct 
drm_i915_gem_request *request
return ret;
}
 
+   /* Reserve GuC WQ space here (one request needs one WQ item) because
+* the later i915_add_request() call can't fail. */
+   ret = i915_guc_wq_reserve_space(request->i915->guc.execbuf_client);
+   if (ret)
+   return ret;
+
return 0;
 }
 
-- 
2.5.0

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] drm/i915/guc: Move GuC wq_reserve_space to ring_reserve_space

2015-12-09 Thread Yu Dai



On 12/09/2015 01:05 AM, Chris Wilson wrote:

On Tue, Dec 08, 2015 at 05:04:50PM -0800, yu@intel.com wrote:
> From: Alex Dai 
>
> Split GuC work queue space reserve and submission and move the space
> reserve to where ring space is reserved. The reason is that failure
> in intel_logical_ring_advance_and_submit won't be handled. In case a
> timeout happens, the driver can return early in order to handle the
> error.

Not here. You want the request_alloc_extras callback.


OK. That is even before touching the ring.

Thanks,
Alex
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH] drm/i915/guc: Move GuC wq_reserve_space to ring_reserve_space

2015-12-08 Thread yu . dai
From: Alex Dai 

Split GuC work queue space reserve and submission and move the space
reserve to where ring space is reserved. The reason is that failure
in intel_logical_ring_advance_and_submit won't be handled. In case a
timeout happens, the driver can return early in order to handle the
error.
---
 drivers/gpu/drm/i915/i915_guc_submission.c | 21 +
 drivers/gpu/drm/i915/intel_guc.h   |  1 +
 drivers/gpu/drm/i915/intel_lrc.c   | 10 +-
 3 files changed, 19 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c 
b/drivers/gpu/drm/i915/i915_guc_submission.c
index 8a71458..264fdf7 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -472,25 +472,21 @@ static void guc_fini_ctx_desc(struct intel_guc *guc,
 sizeof(desc) * client->ctx_index);
 }
 
-/* Get valid workqueue item and return it back to offset */
-static int guc_get_workqueue_space(struct i915_guc_client *gc, u32 *offset)
+int i915_guc_wq_reserve_space(struct i915_guc_client *gc)
 {
struct guc_process_desc *desc;
void *base;
u32 size = sizeof(struct guc_wq_item);
int ret = -ETIMEDOUT, timeout_counter = 200;
 
+   if (!gc)
+   return 0;
+
base = kmap_atomic(i915_gem_object_get_page(gc->client_obj, 0));
desc = base + gc->proc_desc_offset;
 
while (timeout_counter-- > 0) {
if (CIRC_SPACE(gc->wq_tail, desc->head, gc->wq_size) >= size) {
-   *offset = gc->wq_tail;
-
-   /* advance the tail for next workqueue item */
-   gc->wq_tail += size;
-   gc->wq_tail &= gc->wq_size - 1;
-
/* this will break the loop */
timeout_counter = 0;
ret = 0;
@@ -512,11 +508,12 @@ static int guc_add_workqueue_item(struct i915_guc_client 
*gc,
struct guc_wq_item *wqi;
void *base;
u32 tail, wq_len, wq_off = 0;
-   int ret;
 
-   ret = guc_get_workqueue_space(gc, &wq_off);
-   if (ret)
-   return ret;
+   wq_off = gc->wq_tail;
+
+   /* advance the tail for next workqueue item */
+   gc->wq_tail += sizeof(struct guc_wq_item);
+   gc->wq_tail &= gc->wq_size - 1;
 
/* For now workqueue item is 4 DWs; workqueue buffer is 2 pages. So we
 * should not have the case where structure wqi is across page, neither
diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
index 0e048bf..59c8e21 100644
--- a/drivers/gpu/drm/i915/intel_guc.h
+++ b/drivers/gpu/drm/i915/intel_guc.h
@@ -123,5 +123,6 @@ int i915_guc_submit(struct i915_guc_client *client,
struct drm_i915_gem_request *rq);
 void i915_guc_submission_disable(struct drm_device *dev);
 void i915_guc_submission_fini(struct drm_device *dev);
+int i915_guc_wq_reserve_space(struct i915_guc_client *client);
 
 #endif
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index f96fb51..25cbeab 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -844,6 +844,8 @@ int intel_logical_ring_begin(struct drm_i915_gem_request 
*req, int num_dwords)
 
 int intel_logical_ring_reserve_space(struct drm_i915_gem_request *request)
 {
+   int ret;
+
/*
 * The first call merely notes the reserve request and is common for
 * all back ends. The subsequent localised _begin() call actually
@@ -854,7 +856,13 @@ int intel_logical_ring_reserve_space(struct 
drm_i915_gem_request *request)
 */
intel_ring_reserved_space_reserve(request->ringbuf, 
MIN_SPACE_FOR_ADD_REQUEST);
 
-   return intel_logical_ring_begin(request, 0);
+   ret = intel_logical_ring_begin(request, 0);
+   if (ret)
+   return ret;
+
+   ret = i915_guc_wq_reserve_space(request->i915->guc.execbuf_client);
+
+   return ret;
 }
 
 /**
-- 
2.5.0

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH v4] drm/i915/guc: Clean up locks in GuC

2015-12-02 Thread yu . dai
From: Alex Dai 

For now, remove the spinlocks that protected the GuC's
statistics block and work queue; they are only accessed
by code that already holds the global struct_mutex, and
so are redundant (until the big struct_mutex rewrite!).

The specific problem that the spinlocks caused was that
if the work queue was full, the driver would try to
spinwait for one jiffy, but with interrupts disabled the
jiffy count would not advance, leading to a system hang.
The issue was found using test case igt/gem_close_race.

The new version will usleep() instead, still holding
the struct_mutex but without any spinlocks.

v4: Reorganize commit message (Dave Gordon)
v3: Remove unnecessary whitespace churn
v2: Clean up wq_lock too
v1: Clean up host2guc lock as well

Signed-off-by: Alex Dai 
---
 drivers/gpu/drm/i915/i915_debugfs.c| 12 ++--
 drivers/gpu/drm/i915/i915_guc_submission.c | 31 ++
 drivers/gpu/drm/i915/intel_guc.h   |  4 
 3 files changed, 12 insertions(+), 35 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c 
b/drivers/gpu/drm/i915/i915_debugfs.c
index a728ff1..d6b7817 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2473,15 +2473,15 @@ static int i915_guc_info(struct seq_file *m, void *data)
if (!HAS_GUC_SCHED(dev_priv->dev))
return 0;
 
+   if (mutex_lock_interruptible(&dev->struct_mutex))
+   return 0;
+
/* Take a local copy of the GuC data, so we can dump it at leisure */
-   spin_lock(&dev_priv->guc.host2guc_lock);
guc = dev_priv->guc;
-   if (guc.execbuf_client) {
-   spin_lock(&guc.execbuf_client->wq_lock);
+   if (guc.execbuf_client)
client = *guc.execbuf_client;
-   spin_unlock(&guc.execbuf_client->wq_lock);
-   }
-   spin_unlock(&dev_priv->guc.host2guc_lock);
+
+   mutex_unlock(&dev->struct_mutex);
 
seq_printf(m, "GuC total action count: %llu\n", guc.action_count);
seq_printf(m, "GuC action failure count: %u\n", guc.action_fail);
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c 
b/drivers/gpu/drm/i915/i915_guc_submission.c
index ed9f100..a7f9785 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -86,7 +86,6 @@ static int host2guc_action(struct intel_guc *guc, u32 *data, 
u32 len)
return -EINVAL;
 
intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL);
-   spin_lock(&dev_priv->guc.host2guc_lock);
 
dev_priv->guc.action_count += 1;
dev_priv->guc.action_cmd = data[0];
@@ -119,7 +118,6 @@ static int host2guc_action(struct intel_guc *guc, u32 
*data, u32 len)
}
dev_priv->guc.action_status = status;
 
-   spin_unlock(&dev_priv->guc.host2guc_lock);
intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
 
return ret;
@@ -292,16 +290,12 @@ static uint32_t select_doorbell_cacheline(struct 
intel_guc *guc)
const uint32_t cacheline_size = cache_line_size();
uint32_t offset;
 
-   spin_lock(&guc->host2guc_lock);
-
/* Doorbell uses a single cache line within a page */
offset = offset_in_page(guc->db_cacheline);
 
/* Moving to next cache line to reduce contention */
guc->db_cacheline += cacheline_size;
 
-   spin_unlock(&guc->host2guc_lock);
-
DRM_DEBUG_DRIVER("selected doorbell cacheline 0x%x, next 0x%x, linesize 
%u\n",
offset, guc->db_cacheline, cacheline_size);
 
@@ -322,13 +316,11 @@ static uint16_t assign_doorbell(struct intel_guc *guc, 
uint32_t priority)
const uint16_t end = start + half;
uint16_t id;
 
-   spin_lock(&guc->host2guc_lock);
id = find_next_zero_bit(guc->doorbell_bitmap, end, start);
if (id == end)
id = GUC_INVALID_DOORBELL_ID;
else
bitmap_set(guc->doorbell_bitmap, id, 1);
-   spin_unlock(&guc->host2guc_lock);
 
DRM_DEBUG_DRIVER("assigned %s priority doorbell id 0x%x\n",
hi_pri ? "high" : "normal", id);
@@ -338,9 +330,7 @@ static uint16_t assign_doorbell(struct intel_guc *guc, 
uint32_t priority)
 
 static void release_doorbell(struct intel_guc *guc, uint16_t id)
 {
-   spin_lock(&guc->host2guc_lock);
bitmap_clear(guc->doorbell_bitmap, id, 1);
-   spin_unlock(&guc->host2guc_lock);
 }
 
 /*
@@ -487,16 +477,13 @@ static int guc_get_workqueue_space(struct i915_guc_client 
*gc, u32 *offset)
struct guc_process_desc *desc;
void *base;
u32 size = sizeof(struct guc_wq_item);
-   int ret = 0, timeout_counter = 200;
+   int ret = -ETIMEDOUT, timeout_counter = 200;
 
base = kmap_atomic(i915_gem_object_get_page(gc->client_obj, 0));
desc = base + gc->proc_desc_offset;
 
while (timeout_counter-- > 0) {
-   ret = wait_for_atomic(CIRC_SPACE(gc->wq_tail, desc->head,
-   gc->wq_size) >= size, 1);

Re: [Intel-gfx] [PATCH] drm/i915: Extend LRC pinning to cover GPU context writeback

2015-12-02 Thread Yu Dai
I tested this with GuC submission enabled and it fixes the issue I found 
during GPU reset.


Reviewed-by: Alex Dai 

On 12/01/2015 06:48 AM, Nick Hoath wrote:

Use the first retired request on a new context to unpin
the old context. This ensures that the hw context remains
bound until it has been written back to by the GPU.
Now that the context is pinned until later in the request/context
lifecycle, it no longer needs to be pinned from context_queue to
retire_requests.
This fixes an issue with GuC submission where the GPU might not
have finished writing back the context before it is unpinned. This
results in a GPU hang.

v2: Moved the new pin to cover GuC submission (Alex Dai)
 Moved the new unpin to request_retire to fix coverage leak
v3: Added switch to default context if freeing a still pinned
 context just in case the hw was actually still using it
v4: Unwrapped context unpin to allow calling without a request
v5: Only create a switch to idle context if the ring doesn't
 already have a request pending on it (Alex Dai)
 Rename unsaved to dirty to avoid double negatives (Dave Gordon)
 Changed _no_req postfix to __ prefix for consistency (Dave Gordon)
 Split out per engine cleanup from context_free as it
 was getting unwieldy
 Corrected locking (Dave Gordon)
v6: Removed some bikeshedding (Mika Kuoppala)
 Added explanation of the GuC hang that this fixes (Daniel Vetter)
v7: Removed extra per request pinning from ring reset code (Alex Dai)
 Added forced ring unpin/clean in error case in context free (Alex Dai)

Signed-off-by: Nick Hoath 
Issue: VIZ-4277
Cc: Daniel Vetter 
Cc: David Gordon 
Cc: Chris Wilson 
Cc: Alex Dai 
Cc: Mika Kuoppala 
---
  drivers/gpu/drm/i915/i915_drv.h  |   1 +
  drivers/gpu/drm/i915/i915_gem.c  |   7 +-
  drivers/gpu/drm/i915/intel_lrc.c | 136 ---
  drivers/gpu/drm/i915/intel_lrc.h |   1 +
  4 files changed, 118 insertions(+), 27 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index d5cf30b..e82717a 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -889,6 +889,7 @@ struct intel_context {
struct {
struct drm_i915_gem_object *state;
struct intel_ringbuffer *ringbuf;
+   bool dirty;
int pin_count;
} engine[I915_NUM_RINGS];
  
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c

index e955499..69e9d96 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1354,6 +1354,9 @@ static void i915_gem_request_retire(struct 
drm_i915_gem_request *request)
  {
trace_i915_gem_request_retire(request);
  
+	if (i915.enable_execlists)
+		intel_lr_context_complete_check(request);
+
/* We know the GPU must have read the request to have
 * sent us the seqno + interrupt, so use the position
 * of tail of the request to update the last known position
@@ -2765,10 +2768,6 @@ static void i915_gem_reset_ring_cleanup(struct 
drm_i915_private *dev_priv,
struct drm_i915_gem_request,
execlist_link);
list_del(&submit_req->execlist_link);
-
-   if (submit_req->ctx != ring->default_context)
-   intel_lr_context_unpin(submit_req);
-
i915_gem_request_unreference(submit_req);
}
spin_unlock_irq(&ring->execlist_lock);
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 06180dc..b4d9c8f 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -566,9 +566,6 @@ static int execlists_context_queue(struct 
drm_i915_gem_request *request)
struct drm_i915_gem_request *cursor;
int num_elements = 0;
  
-	if (request->ctx != ring->default_context)
-		intel_lr_context_pin(request);
-
i915_gem_request_reference(request);
  
spin_lock_irq(&ring->execlist_lock);

@@ -732,6 +729,13 @@ intel_logical_ring_advance_and_submit(struct 
drm_i915_gem_request *request)
if (intel_ring_stopped(ring))
return;
  
+	if (request->ctx != ring->default_context) {
+   if (!request->ctx->engine[ring->id].dirty) {
+   intel_lr_context_pin(request);
+   request->ctx->engine[ring->id].dirty = true;
+   }
+   }
+
if (dev_priv->guc.execbuf_client)
i915_guc_submit(dev_priv->guc.execbuf_client, request);
else
@@ -958,12 +962,6 @@ void intel_execlists_retire_requests(struct 
intel_engine_cs *ring)
spin_unlock_irq(>execlist_lock);
  
  	

[Intel-gfx] [PATCH v3] drm/i915/guc: Clean up locks in GuC

2015-11-30 Thread yu . dai
From: Alex Dai 

When the GuC work queue is full, the driver will wait for the GuC to
free space, delaying 1 ms per retry. The wait needs to be outside the
spin_lock_irq / spin_unlock_irq pair; otherwise a lockup happens,
because jiffies won't be updated while interrupts are disabled. The
unnecessary locks have been removed; dev->struct_mutex is used instead
where needed.

Issue is found in igt/gem_close_race.

v3: Remove unnecessary whitespace churn
v2: Clean up wq_lock too
v1: Clean up host2guc lock as well

Signed-off-by: Alex Dai 
---
 drivers/gpu/drm/i915/i915_debugfs.c| 12 ++--
 drivers/gpu/drm/i915/i915_guc_submission.c | 31 ++
 drivers/gpu/drm/i915/intel_guc.h   |  4 
 3 files changed, 12 insertions(+), 35 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c 
b/drivers/gpu/drm/i915/i915_debugfs.c
index a728ff1..d6b7817 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2473,15 +2473,15 @@ static int i915_guc_info(struct seq_file *m, void *data)
if (!HAS_GUC_SCHED(dev_priv->dev))
return 0;
 
+   if (mutex_lock_interruptible(&dev->struct_mutex))
+   return 0;
+
/* Take a local copy of the GuC data, so we can dump it at leisure */
-   spin_lock(&dev_priv->guc.host2guc_lock);
guc = dev_priv->guc;
-   if (guc.execbuf_client) {
-   spin_lock(&guc.execbuf_client->wq_lock);
+   if (guc.execbuf_client)
client = *guc.execbuf_client;
-   spin_unlock(&guc.execbuf_client->wq_lock);
-   }
-   spin_unlock(&dev_priv->guc.host2guc_lock);
+
+   mutex_unlock(&dev->struct_mutex);
 
seq_printf(m, "GuC total action count: %llu\n", guc.action_count);
seq_printf(m, "GuC action failure count: %u\n", guc.action_fail);
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c 
b/drivers/gpu/drm/i915/i915_guc_submission.c
index ed9f100..a7f9785 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -86,7 +86,6 @@ static int host2guc_action(struct intel_guc *guc, u32 *data, 
u32 len)
return -EINVAL;
 
intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL);
-   spin_lock(&dev_priv->guc.host2guc_lock);
 
dev_priv->guc.action_count += 1;
dev_priv->guc.action_cmd = data[0];
@@ -119,7 +118,6 @@ static int host2guc_action(struct intel_guc *guc, u32 
*data, u32 len)
}
dev_priv->guc.action_status = status;
 
-   spin_unlock(&dev_priv->guc.host2guc_lock);
intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
 
return ret;
@@ -292,16 +290,12 @@ static uint32_t select_doorbell_cacheline(struct 
intel_guc *guc)
const uint32_t cacheline_size = cache_line_size();
uint32_t offset;
 
-   spin_lock(&guc->host2guc_lock);
-
/* Doorbell uses a single cache line within a page */
offset = offset_in_page(guc->db_cacheline);
 
/* Moving to next cache line to reduce contention */
guc->db_cacheline += cacheline_size;
 
-   spin_unlock(&guc->host2guc_lock);
-
DRM_DEBUG_DRIVER("selected doorbell cacheline 0x%x, next 0x%x, linesize 
%u\n",
offset, guc->db_cacheline, cacheline_size);
 
@@ -322,13 +316,11 @@ static uint16_t assign_doorbell(struct intel_guc *guc, 
uint32_t priority)
const uint16_t end = start + half;
uint16_t id;
 
-   spin_lock(&guc->host2guc_lock);
id = find_next_zero_bit(guc->doorbell_bitmap, end, start);
if (id == end)
id = GUC_INVALID_DOORBELL_ID;
else
bitmap_set(guc->doorbell_bitmap, id, 1);
-   spin_unlock(&guc->host2guc_lock);
 
DRM_DEBUG_DRIVER("assigned %s priority doorbell id 0x%x\n",
hi_pri ? "high" : "normal", id);
@@ -338,9 +330,7 @@ static uint16_t assign_doorbell(struct intel_guc *guc, 
uint32_t priority)
 
 static void release_doorbell(struct intel_guc *guc, uint16_t id)
 {
-   spin_lock(&guc->host2guc_lock);
bitmap_clear(guc->doorbell_bitmap, id, 1);
-   spin_unlock(&guc->host2guc_lock);
 }
 
 /*
@@ -487,16 +477,13 @@ static int guc_get_workqueue_space(struct i915_guc_client 
*gc, u32 *offset)
struct guc_process_desc *desc;
void *base;
u32 size = sizeof(struct guc_wq_item);
-   int ret = 0, timeout_counter = 200;
+   int ret = -ETIMEDOUT, timeout_counter = 200;
 
base = kmap_atomic(i915_gem_object_get_page(gc->client_obj, 0));
desc = base + gc->proc_desc_offset;
 
while (timeout_counter-- > 0) {
-   ret = wait_for_atomic(CIRC_SPACE(gc->wq_tail, desc->head,
-   gc->wq_size) >= size, 1);
-
-   if (!ret) {
+   if (CIRC_SPACE(gc->wq_tail, desc->head, gc->wq_size) >= size) {
*offset = gc->wq_tail;
 
/* advance the tail for next workqueue item */
@@ -505,7 +492,11 @@ static int 

[Intel-gfx] [PATCH v2] drm/i915/guc: Clean up locks in GuC

2015-11-25 Thread yu . dai
From: Alex Dai 

When the GuC work queue is full, the driver will wait for the GuC to
free space, delaying 1 ms per retry. The wait needs to be outside the
spin_lock_irq / spin_unlock_irq pair; otherwise a lockup happens,
because jiffies won't be updated while interrupts are disabled. The
unnecessary locks have been removed; dev->struct_mutex is used instead
where needed.

Issue is found in igt/gem_close_race.

v2: Clean up wq_lock too
v1: Clean up host2guc lock as well

Signed-off-by: Alex Dai 
---
 drivers/gpu/drm/i915/i915_debugfs.c| 12 +--
 drivers/gpu/drm/i915/i915_guc_submission.c | 32 +++---
 drivers/gpu/drm/i915/intel_guc.h   |  4 
 3 files changed, 13 insertions(+), 35 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c 
b/drivers/gpu/drm/i915/i915_debugfs.c
index a728ff1..d6b7817 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2473,15 +2473,15 @@ static int i915_guc_info(struct seq_file *m, void *data)
if (!HAS_GUC_SCHED(dev_priv->dev))
return 0;
 
+   if (mutex_lock_interruptible(&dev->struct_mutex))
+   return 0;
+
/* Take a local copy of the GuC data, so we can dump it at leisure */
-   spin_lock(&dev_priv->guc.host2guc_lock);
guc = dev_priv->guc;
-   if (guc.execbuf_client) {
-   spin_lock(&guc.execbuf_client->wq_lock);
+   if (guc.execbuf_client)
client = *guc.execbuf_client;
-   spin_unlock(&guc.execbuf_client->wq_lock);
-   }
-   spin_unlock(&dev_priv->guc.host2guc_lock);
+
+   mutex_unlock(&dev->struct_mutex);
 
seq_printf(m, "GuC total action count: %llu\n", guc.action_count);
seq_printf(m, "GuC action failure count: %u\n", guc.action_fail);
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c 
b/drivers/gpu/drm/i915/i915_guc_submission.c
index ed9f100..97996e5 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -86,7 +86,6 @@ static int host2guc_action(struct intel_guc *guc, u32 *data, 
u32 len)
return -EINVAL;
 
intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL);
-   spin_lock(&dev_priv->guc.host2guc_lock);
 
dev_priv->guc.action_count += 1;
dev_priv->guc.action_cmd = data[0];
@@ -119,7 +118,6 @@ static int host2guc_action(struct intel_guc *guc, u32 
*data, u32 len)
}
dev_priv->guc.action_status = status;
 
-   spin_unlock(&dev_priv->guc.host2guc_lock);
intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
 
return ret;
@@ -249,6 +247,7 @@ static int guc_ring_doorbell(struct i915_guc_client *gc)
}
 
kunmap_atomic(base);
+
return ret;
 }
 
@@ -292,16 +291,12 @@ static uint32_t select_doorbell_cacheline(struct 
intel_guc *guc)
const uint32_t cacheline_size = cache_line_size();
uint32_t offset;
 
-   spin_lock(&guc->host2guc_lock);
-
/* Doorbell uses a single cache line within a page */
offset = offset_in_page(guc->db_cacheline);
 
/* Moving to next cache line to reduce contention */
guc->db_cacheline += cacheline_size;
 
-   spin_unlock(&guc->host2guc_lock);
-
DRM_DEBUG_DRIVER("selected doorbell cacheline 0x%x, next 0x%x, linesize 
%u\n",
offset, guc->db_cacheline, cacheline_size);
 
@@ -322,13 +317,11 @@ static uint16_t assign_doorbell(struct intel_guc *guc, 
uint32_t priority)
const uint16_t end = start + half;
uint16_t id;
 
-   spin_lock(&guc->host2guc_lock);
id = find_next_zero_bit(guc->doorbell_bitmap, end, start);
if (id == end)
id = GUC_INVALID_DOORBELL_ID;
else
bitmap_set(guc->doorbell_bitmap, id, 1);
-   spin_unlock(&guc->host2guc_lock);
 
DRM_DEBUG_DRIVER("assigned %s priority doorbell id 0x%x\n",
hi_pri ? "high" : "normal", id);
@@ -338,9 +331,7 @@ static uint16_t assign_doorbell(struct intel_guc *guc, 
uint32_t priority)
 
 static void release_doorbell(struct intel_guc *guc, uint16_t id)
 {
-   spin_lock(&guc->host2guc_lock);
bitmap_clear(guc->doorbell_bitmap, id, 1);
-   spin_unlock(&guc->host2guc_lock);
 }
 
 /*
@@ -487,16 +478,13 @@ static int guc_get_workqueue_space(struct i915_guc_client 
*gc, u32 *offset)
struct guc_process_desc *desc;
void *base;
u32 size = sizeof(struct guc_wq_item);
-   int ret = 0, timeout_counter = 200;
+   int ret = -ETIMEDOUT, timeout_counter = 200;
 
base = kmap_atomic(i915_gem_object_get_page(gc->client_obj, 0));
desc = base + gc->proc_desc_offset;
 
while (timeout_counter-- > 0) {
-   ret = wait_for_atomic(CIRC_SPACE(gc->wq_tail, desc->head,
-   gc->wq_size) >= size, 1);
-
-   if (!ret) {
+   if (CIRC_SPACE(gc->wq_tail, desc->head, gc->wq_size) >= size) {
*offset = gc->wq_tail;
 

Re: [Intel-gfx] [PATCH] drm/i915: Change context lifecycle

2015-11-25 Thread Yu Dai



On 11/25/2015 07:02 AM, Mika Kuoppala wrote:

Nick Hoath  writes:

> Use the first retired request on a new context to unpin
> the old context. This ensures that the hw context remains
> bound until it has been written back to by the GPU.
> Now that the context is pinned until later in the request/context
> lifecycle, it no longer needs to be pinned from context_queue to
> retire_requests.
>
> v2: Moved the new pin to cover GuC submission (Alex Dai)
> Moved the new unpin to request_retire to fix coverage leak
> v3: Added switch to default context if freeing a still pinned
> context just in case the hw was actually still using it
> v4: Unwrapped context unpin to allow calling without a request
> v5: Only create a switch to idle context if the ring doesn't
> already have a request pending on it (Alex Dai)
> Rename unsaved to dirty to avoid double negatives (Dave Gordon)
> Changed _no_req postfix to __ prefix for consistency (Dave Gordon)
> Split out per engine cleanup from context_free as it
> was getting unwieldy
> Corrected locking (Dave Gordon)
>
> Signed-off-by: Nick Hoath 
> Issue: VIZ-4277
> Cc: Daniel Vetter 
> Cc: David Gordon 
> Cc: Chris Wilson 
> Cc: Alex Dai 
> ---
>  drivers/gpu/drm/i915/i915_drv.h  |   1 +
>  drivers/gpu/drm/i915/i915_gem.c  |   3 +
>  drivers/gpu/drm/i915/intel_lrc.c | 124 
+++
>  drivers/gpu/drm/i915/intel_lrc.h |   1 +
>  4 files changed, 105 insertions(+), 24 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index d5cf30b..e82717a 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -889,6 +889,7 @@ struct intel_context {
>struct {
>struct drm_i915_gem_object *state;
>struct intel_ringbuffer *ringbuf;
> +  bool dirty;
>int pin_count;
>} engine[I915_NUM_RINGS];
>
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index e955499..3829bc1 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -1354,6 +1354,9 @@ static void i915_gem_request_retire(struct 
drm_i915_gem_request *request)
>  {
>trace_i915_gem_request_retire(request);
>
> +  if (i915.enable_execlists)
> +  intel_lr_context_complete_check(request);
> +
>/* We know the GPU must have read the request to have
> * sent us the seqno + interrupt, so use the position
> * of tail of the request to update the last known position
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c 
b/drivers/gpu/drm/i915/intel_lrc.c
> index 06180dc..03d5bca 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -566,9 +566,6 @@ static int execlists_context_queue(struct 
drm_i915_gem_request *request)
>struct drm_i915_gem_request *cursor;
>int num_elements = 0;
>
> -  if (request->ctx != ring->default_context)
> -  intel_lr_context_pin(request);
> -
>i915_gem_request_reference(request);
>
>spin_lock_irq(&ring->execlist_lock);
> @@ -732,6 +729,13 @@ intel_logical_ring_advance_and_submit(struct 
drm_i915_gem_request *request)
>if (intel_ring_stopped(ring))
>return;
>
> +  if (request->ctx != ring->default_context) {
> +  if (!request->ctx->engine[ring->id].dirty) {
> +  intel_lr_context_pin(request);
> +  request->ctx->engine[ring->id].dirty = true;
> +  }
> +  }
> +
>if (dev_priv->guc.execbuf_client)
>i915_guc_submit(dev_priv->guc.execbuf_client, request);
>else
> @@ -958,12 +962,6 @@ void intel_execlists_retire_requests(struct 
intel_engine_cs *ring)
>spin_unlock_irq(&ring->execlist_lock);
>
>list_for_each_entry_safe(req, tmp, &retired_list, execlist_link) {
> -  struct intel_context *ctx = req->ctx;
> -  struct drm_i915_gem_object *ctx_obj =
> -  ctx->engine[ring->id].state;
> -
> -  if (ctx_obj && (ctx != ring->default_context))
> -  intel_lr_context_unpin(req);
>list_del(&req->execlist_link);
>i915_gem_request_unreference(req);
>}
> @@ -1058,21 +1056,39 @@ reset_pin_count:
>return ret;
>  }
>
> -void intel_lr_context_unpin(struct drm_i915_gem_request *rq)
> +static void __intel_lr_context_unpin(struct intel_engine_cs *ring,
> +  struct intel_context *ctx)
>  {
> -  struct intel_engine_cs *ring = rq->ring;
> -  struct drm_i915_gem_object *ctx_obj = rq->ctx->engine[ring->id].state;
> -  struct intel_ringbuffer *ringbuf = rq->ringbuf;
> -
> +  struct drm_i915_gem_object *ctx_obj = ctx->engine[ring->id].state;
> +  struct intel_ringbuffer *ringbuf = ctx->engine[ring->id].ringbuf;
>if (ctx_obj) {
>WARN_ON(!mutex_is_locked(&ring->dev->struct_mutex));
> -  if 

Re: [Intel-gfx] [PATCH] drm/i915: Change context lifecycle

2015-11-25 Thread Yu Dai
OK, here is my understanding. We pin/unpin the context during request 
create/free and during request submit/retire, but with a dirty-flag 
check. So the context pincount will be the number of outstanding 
requests, plus 1 if the context is dirty. Dirty means the context is 
likely still being accessed by the HW; not-dirty means the HW is not 
accessing the LRC at that moment. This extra pincount is held until we 
see the retirement of a request from a different context.


The switch to idle context and wait looks good to me too. I tested it 
out and it fixes the hang issue when GuC is enabled.


Reviewed-by: Alex Dai 

Thanks,
Alex

On 11/25/2015 04:57 AM, Nick Hoath wrote:

Use the first retired request on a new context to unpin
the old context. This ensures that the hw context remains
bound until it has been written back to by the GPU.
Now that the context is pinned until later in the request/context
lifecycle, it no longer needs to be pinned from context_queue to
retire_requests.

v2: Moved the new pin to cover GuC submission (Alex Dai)
 Moved the new unpin to request_retire to fix coverage leak
v3: Added switch to default context if freeing a still pinned
 context just in case the hw was actually still using it
v4: Unwrapped context unpin to allow calling without a request
v5: Only create a switch to idle context if the ring doesn't
 already have a request pending on it (Alex Dai)
 Rename unsaved to dirty to avoid double negatives (Dave Gordon)
 Changed _no_req postfix to __ prefix for consistency (Dave Gordon)
 Split out per engine cleanup from context_free as it
 was getting unwieldy
 Corrected locking (Dave Gordon)

Signed-off-by: Nick Hoath 
Issue: VIZ-4277
Cc: Daniel Vetter 
Cc: David Gordon 
Cc: Chris Wilson 
Cc: Alex Dai 
---
  drivers/gpu/drm/i915/i915_drv.h  |   1 +
  drivers/gpu/drm/i915/i915_gem.c  |   3 +
  drivers/gpu/drm/i915/intel_lrc.c | 124 +++
  drivers/gpu/drm/i915/intel_lrc.h |   1 +
  4 files changed, 105 insertions(+), 24 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index d5cf30b..e82717a 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -889,6 +889,7 @@ struct intel_context {
struct {
struct drm_i915_gem_object *state;
struct intel_ringbuffer *ringbuf;
+   bool dirty;
int pin_count;
} engine[I915_NUM_RINGS];
  
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c

index e955499..3829bc1 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1354,6 +1354,9 @@ static void i915_gem_request_retire(struct 
drm_i915_gem_request *request)
  {
trace_i915_gem_request_retire(request);
  
+	if (i915.enable_execlists)

+   intel_lr_context_complete_check(request);
+
/* We know the GPU must have read the request to have
 * sent us the seqno + interrupt, so use the position
 * of tail of the request to update the last known position
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 06180dc..03d5bca 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -566,9 +566,6 @@ static int execlists_context_queue(struct 
drm_i915_gem_request *request)
struct drm_i915_gem_request *cursor;
int num_elements = 0;
  
-	if (request->ctx != ring->default_context)

-   intel_lr_context_pin(request);
-
i915_gem_request_reference(request);
  
spin_lock_irq(&ring->execlist_lock);

@@ -732,6 +729,13 @@ intel_logical_ring_advance_and_submit(struct 
drm_i915_gem_request *request)
if (intel_ring_stopped(ring))
return;
  
+	if (request->ctx != ring->default_context) {

+   if (!request->ctx->engine[ring->id].dirty) {
+   intel_lr_context_pin(request);
+   request->ctx->engine[ring->id].dirty = true;
+   }
+   }
+
if (dev_priv->guc.execbuf_client)
i915_guc_submit(dev_priv->guc.execbuf_client, request);
else
@@ -958,12 +962,6 @@ void intel_execlists_retire_requests(struct 
intel_engine_cs *ring)
spin_unlock_irq(&ring->execlist_lock);
  
list_for_each_entry_safe(req, tmp, &retired_list, execlist_link) {

-   struct intel_context *ctx = req->ctx;
-   struct drm_i915_gem_object *ctx_obj =
-   ctx->engine[ring->id].state;
-
-   if (ctx_obj && (ctx != ring->default_context))
-   intel_lr_context_unpin(req);
list_del(&req->execlist_link);
i915_gem_request_unreference(req);
}
@@ -1058,21 +1056,39 @@ reset_pin_count:
return ret;
  }
  
-void 

Re: [Intel-gfx] [PATCH] drm/i915/guc: Move wait for GuC out of spinlock/unlock

2015-11-24 Thread Yu Dai



On 11/24/2015 11:13 AM, Daniel Vetter wrote:

On Tue, Nov 24, 2015 at 10:36:54AM -0800, Yu Dai wrote:
>
>
> On 11/24/2015 10:08 AM, Daniel Vetter wrote:
> >On Tue, Nov 24, 2015 at 07:05:47PM +0200, Imre Deak wrote:
> >> On ti, 2015-11-24 at 09:00 -0800, Yu Dai wrote:
> >> >
> >> > On 11/24/2015 05:26 AM, Imre Deak wrote:
> >> > > On ti, 2015-11-24 at 14:04 +0100, Daniel Vetter wrote:
> >> > > > On Mon, Nov 23, 2015 at 03:02:58PM -0800, yu@intel.com wrote:
> >> > > > > From: Alex Dai <yu@intel.com>
> >> > > > >
> >> > > > > When GuC Work Queue is full, driver will wait GuC for available
> >> > > > > space by delaying 1ms. The wait needs to be out of spinlockirq
> >> > > > > /
> >> > > > > unlock. Otherwise, lockup happens because jiffies won't be
> >> > > > > updated
> >> > > > > due to irq being disabled.
> >> > > > >
> >> > > > > Issue is found in igt/gem_close_race.
> >> > > > >
> >> > > > > Signed-off-by: Alex Dai <yu@intel.com>
> >> > > > > ---
> >> > > > >  drivers/gpu/drm/i915/i915_guc_submission.c | 27
> >> > > > > +-
> >> > > > > -
> >> > > > >  1 file changed, 17 insertions(+), 10 deletions(-)
> >> > > > >
> >> > > > > diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c
> >> > > > > b/drivers/gpu/drm/i915/i915_guc_submission.c
> >> > > > > index 0a6b007..1418397 100644
> >> > > > > --- a/drivers/gpu/drm/i915/i915_guc_submission.c
> >> > > > > +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
> >> > > > > @@ -201,10 +201,13 @@ static int guc_ring_doorbell(struct
> >> > > > > i915_guc_client *gc)
> >> > > > >   union guc_doorbell_qw *db;
> >> > > > >   void *base;
> >> > > > >   int attempt = 2, ret = -EAGAIN;
> >> > > > > + unsigned long flags;
> >> > > > >
> >> > > > >   base = kmap_atomic(i915_gem_object_get_page(gc-
> >> > > > > > client_obj, 0));
> >> > > >
> >> > > > We don't need kmap_atomic anymore here now, since it's outside of
> >> > > > the
> >> > > > spinlock.
> >> > > >
> >> > > > >   desc = base + gc->proc_desc_offset;
> >> > > > >
> >> > > > > + spin_lock_irqsave(&gc->wq_lock, flags);
> >> > > >
> >> > > > Please don't use the super-generic _irqsave. It's expensive and
> >> > > > results in
> >> > > > fragile code when someone accidentally reuses something in an
> >> > > > interrupt
> >> > > > handler that was never meant to run in that context.
> >> > > >
> >> > > > Instead please use the most specific funtion:
> >> > > > - spin_lock if you know you are in irq context.
> >> > > > - sipn_lock_irq if you know you are not.
> >> > >
> >> > > Right, and simply spin_lock() if the lock is not taken in IRQ
> >> > > context
> >> > > ever.
> >> >
> >> > This is not in IRQ context. So I will use spin_lock_irq instead.
> >>
> >> You can just use spin_lock(). spin_lock_irq() makes only sense if you
> >> take the lock in IRQ context too, which is not the case.
> >
> >Imo just drop both spinlocks, adding locks for debugfs is overkill imo.
> >
> How about using mutex_lock_interruptible(&dev->struct_mutex) instead in
> debugfs, which is to replace host2guc lock.

Yes.

> spinlock during ring the door bell is still needed.

Where/why is that needed? At least on a quick look I didn't notice
anything.



Currently there is only one GuC client doing command submission, so it 
appears we don't need the lock. When there are more clients, all writing 
to the scratch registers or ringing the doorbell, we don't want them to 
interfere with each other. Also, if we implement a GuC-to-host interrupt 
(say, to handle the log-buffer-full event), we do need to protect the 
GuC client contents. None of that exists today, though. I can clean 
these up and test it out.


Alex
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] drm/i915: Change context lifecycle

2015-11-24 Thread Yu Dai



On 11/24/2015 08:23 AM, Nick Hoath wrote:

Use the first retired request on a new context to unpin
the old context. This ensures that the hw context remains
bound until it has been saved.
Now that the context is pinned until later in the request/context
lifecycle, it no longer needs to be pinned from context_queue to
retire_requests.
This is to solve a hang with GuC submission, and a theoretical
issue with execlist submission.

v2: Moved the new pin to cover GuC submission (Alex Dai)
 Moved the new unpin to request_retire to fix coverage leak
v3: Added switch to default context if freeing a still pinned
 context just in case the hw was actually still using it
v4: Unwrapped context unpin to allow calling without a request

Signed-off-by: Nick Hoath 
Issue: VIZ-4277
Cc: Daniel Vetter 
Cc: David Gordon 
Cc: Chris Wilson 
Cc: Alex Dai 
---
  drivers/gpu/drm/i915/i915_drv.h  |  1 +
  drivers/gpu/drm/i915/i915_gem.c  |  9 -
  drivers/gpu/drm/i915/intel_lrc.c | 73 ++--
  drivers/gpu/drm/i915/intel_lrc.h |  1 +
  4 files changed, 65 insertions(+), 19 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index d5cf30b..4d2f44c 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -889,6 +889,7 @@ struct intel_context {
struct {
struct drm_i915_gem_object *state;
struct intel_ringbuffer *ringbuf;
+   bool unsaved;
int pin_count;
} engine[I915_NUM_RINGS];
  
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c

index e955499..6fee473 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1354,6 +1354,14 @@ static void i915_gem_request_retire(struct 
drm_i915_gem_request *request)
  {
trace_i915_gem_request_retire(request);
  
+	if (i915.enable_execlists) {

+   unsigned long flags;
+
+   spin_lock_irqsave(&request->ring->execlist_lock, flags);
+   intel_lr_context_complete_check(request);
+   spin_unlock_irqrestore(&request->ring->execlist_lock, flags);
+   }
+
/* We know the GPU must have read the request to have
 * sent us the seqno + interrupt, so use the position
 * of tail of the request to update the last known position
@@ -1384,7 +1392,6 @@ __i915_gem_request_retire__upto(struct 
drm_i915_gem_request *req)
do {
tmp = list_first_entry(&engine->request_list,
   typeof(*tmp), list);
-
i915_gem_request_retire(tmp);
} while (tmp != req);
  
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c

index 06180dc..a527c21 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -566,9 +566,6 @@ static int execlists_context_queue(struct 
drm_i915_gem_request *request)
struct drm_i915_gem_request *cursor;
int num_elements = 0;
  
-	if (request->ctx != ring->default_context)

-   intel_lr_context_pin(request);
-
i915_gem_request_reference(request);
  
spin_lock_irq(&ring->execlist_lock);

@@ -728,10 +725,16 @@ intel_logical_ring_advance_and_submit(struct 
drm_i915_gem_request *request)
intel_logical_ring_advance(request->ringbuf);
  
  	request->tail = request->ringbuf->tail;

-
if (intel_ring_stopped(ring))
return;
  
+	if (request->ctx != ring->default_context) {

+   if (!request->ctx->engine[ring->id].unsaved) {
+   intel_lr_context_pin(request);
+   request->ctx->engine[ring->id].unsaved = true;
+   }
+   }
+
if (dev_priv->guc.execbuf_client)
i915_guc_submit(dev_priv->guc.execbuf_client, request);
else
@@ -958,12 +961,6 @@ void intel_execlists_retire_requests(struct 
intel_engine_cs *ring)
spin_unlock_irq(&ring->execlist_lock);
  
list_for_each_entry_safe(req, tmp, &retired_list, execlist_link) {

-   struct intel_context *ctx = req->ctx;
-   struct drm_i915_gem_object *ctx_obj =
-   ctx->engine[ring->id].state;
-
-   if (ctx_obj && (ctx != ring->default_context))
-   intel_lr_context_unpin(req);
list_del(&req->execlist_link);
i915_gem_request_unreference(req);
}
@@ -1058,21 +1055,41 @@ reset_pin_count:
return ret;
  }
  
-void intel_lr_context_unpin(struct drm_i915_gem_request *rq)

+static void intel_lr_context_unpin_no_req(struct intel_engine_cs *ring,
+   struct intel_context *ctx)
  {
-   struct intel_engine_cs *ring = rq->ring;
-   struct drm_i915_gem_object *ctx_obj = rq->ctx->engine[ring->id].state;
-   struct intel_ringbuffer *ringbuf = 

Re: [Intel-gfx] [PATCH] drm/i915/guc: Move wait for GuC out of spinlock/unlock

2015-11-24 Thread Yu Dai



On 11/24/2015 10:08 AM, Daniel Vetter wrote:

On Tue, Nov 24, 2015 at 07:05:47PM +0200, Imre Deak wrote:
> On ti, 2015-11-24 at 09:00 -0800, Yu Dai wrote:
> >
> > On 11/24/2015 05:26 AM, Imre Deak wrote:
> > > On ti, 2015-11-24 at 14:04 +0100, Daniel Vetter wrote:
> > > > On Mon, Nov 23, 2015 at 03:02:58PM -0800, yu@intel.com wrote:
> > > > > From: Alex Dai <yu@intel.com>
> > > > >
> > > > > When GuC Work Queue is full, driver will wait GuC for available
> > > > > space by delaying 1ms. The wait needs to be out of spinlockirq
> > > > > /
> > > > > unlock. Otherwise, lockup happens because jiffies won't be
> > > > > updated
> > > > > due to irq being disabled.
> > > > >
> > > > > Issue is found in igt/gem_close_race.
> > > > >
> > > > > Signed-off-by: Alex Dai <yu@intel.com>
> > > > > ---
> > > > >  drivers/gpu/drm/i915/i915_guc_submission.c | 27
> > > > > +-
> > > > > -
> > > > >  1 file changed, 17 insertions(+), 10 deletions(-)
> > > > >
> > > > > diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c
> > > > > b/drivers/gpu/drm/i915/i915_guc_submission.c
> > > > > index 0a6b007..1418397 100644
> > > > > --- a/drivers/gpu/drm/i915/i915_guc_submission.c
> > > > > +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
> > > > > @@ -201,10 +201,13 @@ static int guc_ring_doorbell(struct
> > > > > i915_guc_client *gc)
> > > > >union guc_doorbell_qw *db;
> > > > >void *base;
> > > > >int attempt = 2, ret = -EAGAIN;
> > > > > +  unsigned long flags;
> > > > >
> > > > >base = kmap_atomic(i915_gem_object_get_page(gc-
> > > > > > client_obj, 0));
> > > >
> > > > We don't need kmap_atomic anymore here now, since it's outside of
> > > > the
> > > > spinlock.
> > > >
> > > > >desc = base + gc->proc_desc_offset;
> > > > >
> > > > > +  spin_lock_irqsave(&gc->wq_lock, flags);
> > > >
> > > > Please don't use the super-generic _irqsave. It's expensive and
> > > > results in
> > > > fragile code when someone accidentally reuses something in an
> > > > interrupt
> > > > handler that was never meant to run in that context.
> > > >
> > > > Instead please use the most specific funtion:
> > > > - spin_lock if you know you are in irq context.
> > > > - sipn_lock_irq if you know you are not.
> > >
> > > Right, and simply spin_lock() if the lock is not taken in IRQ
> > > context
> > > ever.
> >
> > This is not in IRQ context. So I will use spin_lock_irq instead.
>
> You can just use spin_lock(). spin_lock_irq() makes only sense if you
> take the lock in IRQ context too, which is not the case.

Imo just drop both spinlocks, adding locks for debugfs is overkill imo.

How about using mutex_lock_interruptible(&dev->struct_mutex) instead in 
debugfs, to replace the host2guc lock?


A spinlock around ringing the doorbell is still needed.

Alex

>
> > > > - spin_lock_irqsave should be a big warning sign that your code
> > > > has
> > > >   layering issues.
> > > >
> > > > Please audit the entire guc code for the above two issues.
> > >
> > > Agreed, it looks inconsistent atm: we do spin_lock(wq_lock) from
> > > debugfs and spin_lock_irq(wq_lock) from i915_guc_submit(). Neither
> > > of
> > > them are called from IRQ context AFAICS, in which case a simple
> > > spin_lock() would do.
> > >
> > > --Imre
> > >
> > > > > +
> > > > >/* Update the tail so it is visible to GuC */
> > > > >desc->tail = gc->wq_tail;
> > > > >
> > > > > @@ -248,7 +251,10 @@ static int guc_ring_doorbell(struct
> > > > > i915_guc_client *gc)
> > > > >db_exc.cookie = 1;
> > > > >}
> > > > >
> > > > > +  spin_unlock_irqrestore(&gc->wq_lock, flags);
> > > > > +
> > > > >kunmap_atomic(base);
> > > > > +
> > > > >return ret;
> > > > >  }
> > > > 

Re: [Intel-gfx] [PATCH] drm/i915/guc: Move wait for GuC out of spinlock/unlock

2015-11-24 Thread Yu Dai



On 11/24/2015 05:26 AM, Imre Deak wrote:

On ti, 2015-11-24 at 14:04 +0100, Daniel Vetter wrote:
> On Mon, Nov 23, 2015 at 03:02:58PM -0800, yu@intel.com wrote:
> > From: Alex Dai 
> >
> > When GuC Work Queue is full, driver will wait GuC for available
> > space by delaying 1ms. The wait needs to be out of spinlockirq /
> > unlock. Otherwise, lockup happens because jiffies won't be updated
> > due to irq being disabled.
> >
> > Issue is found in igt/gem_close_race.
> >
> > Signed-off-by: Alex Dai 
> > ---
> >  drivers/gpu/drm/i915/i915_guc_submission.c | 27 +-
> > -
> >  1 file changed, 17 insertions(+), 10 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c
> > b/drivers/gpu/drm/i915/i915_guc_submission.c
> > index 0a6b007..1418397 100644
> > --- a/drivers/gpu/drm/i915/i915_guc_submission.c
> > +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
> > @@ -201,10 +201,13 @@ static int guc_ring_doorbell(struct
> > i915_guc_client *gc)
> >   union guc_doorbell_qw *db;
> >   void *base;
> >   int attempt = 2, ret = -EAGAIN;
> > + unsigned long flags;
> >
> >   base = kmap_atomic(i915_gem_object_get_page(gc-
> > >client_obj, 0));
>
> We don't need kmap_atomic anymore here now, since it's outside of the
> spinlock.
>
> >   desc = base + gc->proc_desc_offset;
> >
> > + spin_lock_irqsave(&gc->wq_lock, flags);
>
> Please don't use the super-generic _irqsave. It's expensive and
> results in
> fragile code when someone accidentally reuses something in an
> interrupt
> handler that was never meant to run in that context.
>
> Instead please use the most specific funtion:
> - spin_lock if you know you are in irq context.
> - sipn_lock_irq if you know you are not.

Right, and simply spin_lock() if the lock is not taken in IRQ context
ever.


This is not in IRQ context. So I will use spin_lock_irq instead.

> - spin_lock_irqsave should be a big warning sign that your code has
>   layering issues.
>
> Please audit the entire guc code for the above two issues.

Agreed, it looks inconsistent atm: we do spin_lock(wq_lock) from
debugfs and spin_lock_irq(wq_lock) from i915_guc_submit(). Neither of
them are called from IRQ context AFAICS, in which case a simple
spin_lock() would do.

--Imre

> > +
> >   /* Update the tail so it is visible to GuC */
> >   desc->tail = gc->wq_tail;
> >
> > @@ -248,7 +251,10 @@ static int guc_ring_doorbell(struct
> > i915_guc_client *gc)
> >   db_exc.cookie = 1;
> >   }
> >
> > + spin_unlock_irqrestore(&gc->wq_lock, flags);
> > +
> >   kunmap_atomic(base);
> > +
> >   return ret;
> >  }
> >
> > @@ -487,16 +493,16 @@ static int guc_get_workqueue_space(struct
> > i915_guc_client *gc, u32 *offset)
> >   struct guc_process_desc *desc;
> >   void *base;
> >   u32 size = sizeof(struct guc_wq_item);
> > - int ret = 0, timeout_counter = 200;
> > + int ret = -ETIMEDOUT, timeout_counter = 200;
> > + unsigned long flags;
> >
> >   base = kmap_atomic(i915_gem_object_get_page(gc-
> > >client_obj, 0));
> >   desc = base + gc->proc_desc_offset;
> >
> >   while (timeout_counter-- > 0) {
> > - ret = wait_for_atomic(CIRC_SPACE(gc->wq_tail,
> > desc->head,
> > - gc->wq_size) >= size, 1);
> > + spin_lock_irqsave(&gc->wq_lock, flags);
> >
> > - if (!ret) {
> > + if (CIRC_SPACE(gc->wq_tail, desc->head, gc-
> > >wq_size) >= size) {
> >   *offset = gc->wq_tail;
> >
> >   /* advance the tail for next workqueue
> > item */
> > @@ -505,7 +511,13 @@ static int guc_get_workqueue_space(struct
> > i915_guc_client *gc, u32 *offset)
> >
> >   /* this will break the loop */
> >   timeout_counter = 0;
> > + ret = 0;
> >   }
> > +
> > + spin_unlock_irqrestore(&gc->wq_lock, flags);
> > +
> > + if (timeout_counter)
> > + usleep_range(1000, 2000);
>
> Do we really not have a interrupt/signal from the guc when it has
> cleared
> up some space?
>


This is not implemented in the firmware, although I think it could be done 
through the GuC-to-host interrupt. My worry is that if we implement it, 
the driver will end up handling too many interrupts (perhaps one per 
context switch). Ideally we don't want to handle interrupts at all.

> >   };
> >
> >   kunmap_atomic(base);
> > @@ -597,19 +609,17 @@ int i915_guc_submit(struct i915_guc_client
> > *client,
> >  {
> >   struct intel_guc *guc = client->guc;
> >   enum intel_ring_id ring_id = rq->ring->id;
> > - unsigned long flags;
> >   int q_ret, b_ret;
> >
> >   /* Need this because of the deferred pin ctx and ring */
> >   /* Shall we move this right after ring is pinned? */
> >   lr_context_update(rq);
> >
> > - 

[Intel-gfx] [PATCH] drm/i915/guc: Move wait for GuC out of spinlock/unlock

2015-11-23 Thread yu . dai
From: Alex Dai 

When the GuC Work Queue is full, the driver will wait for the GuC to free
space, delaying 1 ms between attempts. The wait needs to be outside the
spin_lock_irq / spin_unlock_irq pair; otherwise a lockup happens because
jiffies won't be updated while IRQs are disabled.

The issue was found by igt/gem_close_race.

Signed-off-by: Alex Dai 
---
 drivers/gpu/drm/i915/i915_guc_submission.c | 27 +--
 1 file changed, 17 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c 
b/drivers/gpu/drm/i915/i915_guc_submission.c
index 0a6b007..1418397 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -201,10 +201,13 @@ static int guc_ring_doorbell(struct i915_guc_client *gc)
union guc_doorbell_qw *db;
void *base;
int attempt = 2, ret = -EAGAIN;
+   unsigned long flags;
 
base = kmap_atomic(i915_gem_object_get_page(gc->client_obj, 0));
desc = base + gc->proc_desc_offset;
 
+   spin_lock_irqsave(&gc->wq_lock, flags);
+
/* Update the tail so it is visible to GuC */
desc->tail = gc->wq_tail;
 
@@ -248,7 +251,10 @@ static int guc_ring_doorbell(struct i915_guc_client *gc)
db_exc.cookie = 1;
}
 
+   spin_unlock_irqrestore(&gc->wq_lock, flags);
+
kunmap_atomic(base);
+
return ret;
 }
 
@@ -487,16 +493,16 @@ static int guc_get_workqueue_space(struct i915_guc_client 
*gc, u32 *offset)
struct guc_process_desc *desc;
void *base;
u32 size = sizeof(struct guc_wq_item);
-   int ret = 0, timeout_counter = 200;
+   int ret = -ETIMEDOUT, timeout_counter = 200;
+   unsigned long flags;
 
base = kmap_atomic(i915_gem_object_get_page(gc->client_obj, 0));
desc = base + gc->proc_desc_offset;
 
while (timeout_counter-- > 0) {
-   ret = wait_for_atomic(CIRC_SPACE(gc->wq_tail, desc->head,
-   gc->wq_size) >= size, 1);
+   spin_lock_irqsave(&gc->wq_lock, flags);
 
-   if (!ret) {
+   if (CIRC_SPACE(gc->wq_tail, desc->head, gc->wq_size) >= size) {
*offset = gc->wq_tail;
 
/* advance the tail for next workqueue item */
@@ -505,7 +511,13 @@ static int guc_get_workqueue_space(struct i915_guc_client 
*gc, u32 *offset)
 
/* this will break the loop */
timeout_counter = 0;
+   ret = 0;
}
+
+   spin_unlock_irqrestore(&gc->wq_lock, flags);
+
+   if (timeout_counter)
+   usleep_range(1000, 2000);
};
 
kunmap_atomic(base);
@@ -597,19 +609,17 @@ int i915_guc_submit(struct i915_guc_client *client,
 {
struct intel_guc *guc = client->guc;
enum intel_ring_id ring_id = rq->ring->id;
-   unsigned long flags;
int q_ret, b_ret;
 
/* Need this because of the deferred pin ctx and ring */
/* Shall we move this right after ring is pinned? */
lr_context_update(rq);
 
-   spin_lock_irqsave(&client->wq_lock, flags);
-
q_ret = guc_add_workqueue_item(client, rq);
if (q_ret == 0)
b_ret = guc_ring_doorbell(client);
 
+   spin_lock(&guc->host2guc_lock);
client->submissions[ring_id] += 1;
if (q_ret) {
client->q_fail += 1;
@@ -620,9 +630,6 @@ int i915_guc_submit(struct i915_guc_client *client,
} else {
client->retcode = 0;
}
-   spin_unlock_irqrestore(&client->wq_lock, flags);
-
-   spin_lock(&guc->host2guc_lock);
guc->submissions[ring_id] += 1;
guc->last_seqno[ring_id] = rq->seqno;
spin_unlock(&guc->host2guc_lock);
-- 
2.5.0



Re: [Intel-gfx] [PATCH v1] drm/i915: Fix a false alert of memory leak when free LRC

2015-11-23 Thread Yu Dai



On 11/23/2015 02:34 AM, Tvrtko Ursulin wrote:

On 20/11/15 08:31, Daniel Vetter wrote:
> On Thu, Nov 19, 2015 at 04:10:26PM -0800, yu@intel.com wrote:
>> From: Alex Dai 
>>
>> There is a memory leak warning message from i915_gem_context_clean
>> when GuC submission is enabled. The reason is that when LRC is
>> released, its ppgtt could be still referenced. The assumption that
>> all VMAs are unbound during release of LRC is not true.
>>
>> v1: Move the code inside i915_gem_context_clean() to where ppgtt is
>> released because it is not cleaning context anyway but ppgtt.
>>
>> Signed-off-by: Alex Dai 
>
> retire__read drops the ctx (and hence ppgtt) reference too early,
> resulting in us hitting the WARNING. See the giant thread with Tvrtko,
> Chris and me:
>
> http://www.spinics.net/lists/intel-gfx/msg78918.html
>
> Would be great if someone could test the diff I posted in there.

It doesn't work - I have posted my IGT snippet which I thought explained it.


I thought moving the VMA list cleanup into i915_ppgtt_release() should 
work. However, it creates a chicken-and-egg problem: ppgtt_release() 
relies on vma_unbound() being called first to decrease its refcount, so 
calling vma_unbound() from inside ppgtt_release() is not right.

Problem req unreference in obj->active case. When it hits that path it
will not move the VMA to inactive and the
intel_execlists_retire_requests will be the last unreference from the
retire worker which will trigger the WARN.


I still think the problem comes from the assumption that when the LRC is 
released, all of its VMAs should already be unbound. Precisely, I mean 
the comment made for i915_gem_context_clean(): "This context is going 
away and we need to remove all VMAs still around." Really, the LRC life 
cycle is different from that of the ppgtt / VMAs. Check the line right 
after i915_gem_context_clean(): it is ppgtt_put(). In the case where the 
LRC is freed early, it won't release the ppgtt anyway, because the ppgtt 
is still referenced by VMAs; it will be freed when the GEM object has no 
references left.



I posted an IGT which hits that ->
http://patchwork.freedesktop.org/patch/65369/

And posted one give up on the active VMA mem leak patch ->
http://patchwork.freedesktop.org/patch/65529/


This patch will silence the warning. But I think 
i915_gem_context_clean() itself is unnecessary; I don't see any issue with 
deleting it. The check of the VMA list is inside ppgtt_release(), and the 
unbinding should be tied to the GEM obj's life cycle, not the lrc's.

I have no idea yet of GuC implications, I just spotted this parallel thread.

And Mika has proposed something interesting - that we could just clean
up the active VMA in context cleanup since we know it is a false one.

However, again I don't know how that interacts with the GuC. Surely it
cannot be freeing the context with stuff genuinely still active in the GuC?



There is no interaction with the GuC, though; it is just very easy to see 
the warning when the GuC is enabled, e.g. when running gem_close_race. The 
reason is that the GuC does not use the execlist_queue 
(execlist_retired_req_list), which is deferred to the retire worker. As in 
ring submission mode, when the GuC is enabled, whenever the driver submits 
a new batch it will try to release the previous request. I don't know why 
intel_execlists_retire_requests() is not called in this case; probably 
because of the unpin. Deferring the retirement may just hide the issue. I 
bet you will see the warning more often if you change 
i915_gem_retire_requests_ring() to i915_gem_retire_requests() in 
i915_gem_execbuffer_reserve().


Thanks,
Alex


Re: [Intel-gfx] [PATCH v1] drm/i915: Fix a false alert of memory leak when free LRC

2015-11-20 Thread Yu Dai



On 11/20/2015 12:31 AM, Daniel Vetter wrote:

On Thu, Nov 19, 2015 at 04:10:26PM -0800, yu@intel.com wrote:
> From: Alex Dai 
>
> There is a memory leak warning message from i915_gem_context_clean
> when GuC submission is enabled. The reason is that when LRC is
> released, its ppgtt could be still referenced. The assumption that
> all VMAs are unbound during release of LRC is not true.
>
> v1: Move the code inside i915_gem_context_clean() to where ppgtt is
> released because it is not cleaning context anyway but ppgtt.
>
> Signed-off-by: Alex Dai 

retire__read drops the ctx (and hence ppgtt) reference too early,
resulting in us hitting the WARNING. See the giant thread with Tvrtko,
Chris and me:

http://www.spinics.net/lists/intel-gfx/msg78918.html

Would be great if someone could test the diff I posted in there.


I have to recall my patch, because calling vma_unbind() inside 
ppgtt_release() is not right.



> ---
>  drivers/gpu/drm/i915/i915_gem_context.c | 24 
>  drivers/gpu/drm/i915/i915_gem_gtt.c | 12 
>  2 files changed, 12 insertions(+), 24 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem_context.c 
b/drivers/gpu/drm/i915/i915_gem_context.c
> index 204dc7c..cc5c8e6 100644
> --- a/drivers/gpu/drm/i915/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> @@ -133,23 +133,6 @@ static int get_context_size(struct drm_device *dev)
>return ret;
>  }
>
> -static void i915_gem_context_clean(struct intel_context *ctx)
> -{
> -  struct i915_hw_ppgtt *ppgtt = ctx->ppgtt;
> -  struct i915_vma *vma, *next;
> -
> -  if (!ppgtt)
> -  return;
> -
> -  WARN_ON(!list_empty(&ppgtt->base.active_list));
> -
> -  list_for_each_entry_safe(vma, next, &ppgtt->base.inactive_list,
> -   mm_list) {
> -  if (WARN_ON(__i915_vma_unbind_no_wait(vma)))
> -  break;
> -  }
> -}
> -
>  void i915_gem_context_free(struct kref *ctx_ref)
>  {
>struct intel_context *ctx = container_of(ctx_ref, typeof(*ctx), ref);
> @@ -159,13 +142,6 @@ void i915_gem_context_free(struct kref *ctx_ref)
>if (i915.enable_execlists)
>intel_lr_context_free(ctx);
>
> -  /*
> -   * This context is going away and we need to remove all VMAs still
> -   * around. This is to handle imported shared objects for which
> -   * destructor did not run when their handles were closed.
> -   */
> -  i915_gem_context_clean(ctx);
> -
>i915_ppgtt_put(ctx->ppgtt);
>
>if (ctx->legacy_hw_ctx.rcs_state)
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c 
b/drivers/gpu/drm/i915/i915_gem_gtt.c
> index 016739e..d36943c 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> @@ -2214,6 +2214,7 @@ void  i915_ppgtt_release(struct kref *kref)
>  {
>struct i915_hw_ppgtt *ppgtt =
>container_of(kref, struct i915_hw_ppgtt, ref);
> +  struct i915_vma *vma, *next;
>
>trace_i915_ppgtt_release(&ppgtt->base);
>
> @@ -2221,6 +,17 @@ void  i915_ppgtt_release(struct kref *kref)
>WARN_ON(!list_empty(&ppgtt->base.active_list));
>WARN_ON(!list_empty(&ppgtt->base.inactive_list));
>
> +  /*
> +   * This ppgtt is going away and we need to remove all VMAs still
> +   * around. This is to handle imported shared objects for which
> +   * destructor did not run when their handles were closed.
> +   */
> +  list_for_each_entry_safe(vma, next, &ppgtt->base.inactive_list,
> +   mm_list) {
> +  if (WARN_ON(__i915_vma_unbind_no_wait(vma)))
> +  break;
> +  }
> +
>list_del(&ppgtt->base.global_link);
>drm_mm_takedown(&ppgtt->base.mm);
>
> --
> 2.5.0
>





[Intel-gfx] [PATCH v1] drm/i915: Fix a false alert of memory leak when free LRC

2015-11-19 Thread yu . dai
From: Alex Dai 

There is a memory leak warning message from i915_gem_context_clean
when GuC submission is enabled. The reason is that when LRC is
released, its ppgtt could be still referenced. The assumption that
all VMAs are unbound during release of LRC is not true.

v1: Move the code inside i915_gem_context_clean() to where ppgtt is
released because it is not cleaning context anyway but ppgtt.

Signed-off-by: Alex Dai 
---
 drivers/gpu/drm/i915/i915_gem_context.c | 24 
 drivers/gpu/drm/i915/i915_gem_gtt.c | 12 
 2 files changed, 12 insertions(+), 24 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_context.c 
b/drivers/gpu/drm/i915/i915_gem_context.c
index 204dc7c..cc5c8e6 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -133,23 +133,6 @@ static int get_context_size(struct drm_device *dev)
return ret;
 }
 
-static void i915_gem_context_clean(struct intel_context *ctx)
-{
-   struct i915_hw_ppgtt *ppgtt = ctx->ppgtt;
-   struct i915_vma *vma, *next;
-
-   if (!ppgtt)
-   return;
-
-   WARN_ON(!list_empty(&ppgtt->base.active_list));
-
-   list_for_each_entry_safe(vma, next, &ppgtt->base.inactive_list,
-mm_list) {
-   if (WARN_ON(__i915_vma_unbind_no_wait(vma)))
-   break;
-   }
-}
-
 void i915_gem_context_free(struct kref *ctx_ref)
 {
struct intel_context *ctx = container_of(ctx_ref, typeof(*ctx), ref);
@@ -159,13 +142,6 @@ void i915_gem_context_free(struct kref *ctx_ref)
if (i915.enable_execlists)
intel_lr_context_free(ctx);
 
-   /*
-* This context is going away and we need to remove all VMAs still
-* around. This is to handle imported shared objects for which
-* destructor did not run when their handles were closed.
-*/
-   i915_gem_context_clean(ctx);
-
i915_ppgtt_put(ctx->ppgtt);
 
if (ctx->legacy_hw_ctx.rcs_state)
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c 
b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 016739e..d36943c 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -2214,6 +2214,7 @@ void  i915_ppgtt_release(struct kref *kref)
 {
struct i915_hw_ppgtt *ppgtt =
container_of(kref, struct i915_hw_ppgtt, ref);
+   struct i915_vma *vma, *next;
 
	trace_i915_ppgtt_release(&ppgtt->base);
 
@@ -2221,6 +,17 @@ void  i915_ppgtt_release(struct kref *kref)
	WARN_ON(!list_empty(&ppgtt->base.active_list));
	WARN_ON(!list_empty(&ppgtt->base.inactive_list));
 
+   /*
+* This ppgtt is going away and we need to remove all VMAs still
+* around. This is to handle imported shared objects for which
+* destructor did not run when their handles were closed.
+*/
+   list_for_each_entry_safe(vma, next, &ppgtt->base.inactive_list,
+mm_list) {
+   if (WARN_ON(__i915_vma_unbind_no_wait(vma)))
+   break;
+   }
+
	list_del(&ppgtt->base.global_link);
	drm_mm_takedown(&ppgtt->base.mm);
 
-- 
2.5.0



[Intel-gfx] [PATCH] drm/i915/guc: Keep irq enabled during GuC cmd submission

2015-11-19 Thread yu . dai
From: Alex Dai 

When the GuC work queue is full, the driver will wait for the GuC to
free space by calling wait_for_atomic(). If irqs are disabled, a
lockup will happen because jiffies won't be updated.

The issue was found in igt/gem_close_race.

Signed-off-by: Alex Dai 
---
 drivers/gpu/drm/i915/i915_guc_submission.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c 
b/drivers/gpu/drm/i915/i915_guc_submission.c
index 0a6b007..bbfa6ed 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -597,14 +597,13 @@ int i915_guc_submit(struct i915_guc_client *client,
 {
struct intel_guc *guc = client->guc;
enum intel_ring_id ring_id = rq->ring->id;
-   unsigned long flags;
int q_ret, b_ret;
 
/* Need this because of the deferred pin ctx and ring */
/* Shall we move this right after ring is pinned? */
lr_context_update(rq);
 
-   spin_lock_irqsave(&client->wq_lock, flags);
+   spin_lock(&client->wq_lock);
 
q_ret = guc_add_workqueue_item(client, rq);
if (q_ret == 0)
@@ -620,7 +619,7 @@ int i915_guc_submit(struct i915_guc_client *client,
} else {
client->retcode = 0;
}
-   spin_unlock_irqrestore(&client->wq_lock, flags);
+   spin_unlock(&client->wq_lock);
 
	spin_lock(&guc->host2guc_lock);
guc->submissions[ring_id] += 1;
-- 
2.5.0



[Intel-gfx] [PATCH v1] drm/i915: Defer LRC unpin and release

2015-11-19 Thread yu . dai
From: Alex Dai 

The LRC can't be freed (or even unpinned) immediately when all its
referenced requests are completed, because the HW still needs a short
period of time to save data to the LRC status page. It is safe to free
an LRC once the HW has completed a request from a different LRC.

Introduce a new function, intel_lr_context_do_unpin(), that does the
actual unpin work. When the driver receives an unpin call (from the
retiring of a request), the LRC pin & ref counts are increased to defer
the unpin and release. If the last LRC is different and its pincount
reaches zero, the driver does the actual unpin work.

There will always be one LRC kept until the ring itself gets cleaned up.

v1: Simplify the update of the last context by reusing the existing
ring->last_context. Note that it is safe to do so because the lrc ring
is cleaned up earlier than i915_gem_context_fini().

Signed-off-by: Alex Dai 
---
 drivers/gpu/drm/i915/intel_lrc.c | 59 
 1 file changed, 54 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 06180dc..7a3c9cc 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1039,6 +1039,55 @@ unpin_ctx_obj:
return ret;
 }
 
+static void intel_lr_context_do_unpin(struct intel_engine_cs *ring,
+   struct intel_context *ctx)
+{
+   struct drm_device *dev = ring->dev;
+   struct drm_i915_private *dev_priv = dev->dev_private;
+   struct drm_i915_gem_object *ctx_obj;
+
+   WARN_ON(!mutex_is_locked(&ring->dev->struct_mutex));
+
+   ctx_obj = ctx->engine[ring->id].state;
+   if (!ctx_obj)
+   return;
+
+   i915_gem_object_ggtt_unpin(ctx_obj);
+   intel_unpin_ringbuffer_obj(ctx->engine[ring->id].ringbuf);
+
+   /* Invalidate GuC TLB. */
+   if (i915.enable_guc_submission)
+   I915_WRITE(GEN8_GTCR, GEN8_GTCR_INVALIDATE);
+}
+
+static void set_last_lrc(struct intel_engine_cs *ring,
+   struct intel_context *ctx)
+{
+   struct intel_context *last;
+
+   /* Unpin will be deferred, and so will the release of the lrc. Hold the
+* pin & ref count until we receive the retire of the next request. */
+   if (ctx) {
+   ctx->engine[ring->id].pin_count++;
+   i915_gem_context_reference(ctx);
+   }
+
+   last = ring->last_context;
+   ring->last_context = ctx;
+
+   if (last == NULL)
+   return;
+
+   /* Unpin is on hold for last context. Release pincount first. Then if HW
+* completes request from another lrc, try to do the actual unpin. */
+   last->engine[ring->id].pin_count--;
+   if (last != ctx && !last->engine[ring->id].pin_count)
+   intel_lr_context_do_unpin(ring, last);
+
+   /* Release previous context refcount that on hold */
+   i915_gem_context_unreference(last);
+}
+
 static int intel_lr_context_pin(struct drm_i915_gem_request *rq)
 {
int ret = 0;
@@ -1062,14 +,11 @@ void intel_lr_context_unpin(struct drm_i915_gem_request 
*rq)
 {
struct intel_engine_cs *ring = rq->ring;
struct drm_i915_gem_object *ctx_obj = rq->ctx->engine[ring->id].state;
-   struct intel_ringbuffer *ringbuf = rq->ringbuf;
 
if (ctx_obj) {
WARN_ON(!mutex_is_locked(&ring->dev->struct_mutex));
-   if (--rq->ctx->engine[ring->id].pin_count == 0) {
-   intel_unpin_ringbuffer_obj(ringbuf);
-   i915_gem_object_ggtt_unpin(ctx_obj);
-   }
+   --rq->ctx->engine[ring->id].pin_count;
+   set_last_lrc(ring, rq->ctx);
}
 }
 
@@ -1908,6 +1954,9 @@ void intel_logical_ring_cleanup(struct intel_engine_cs 
*ring)
}
 
lrc_destroy_wa_ctx_obj(ring);
+
+   /* this will clean up last lrc */
+   set_last_lrc(ring, NULL);
 }
 
 static int logical_ring_init(struct drm_device *dev, struct intel_engine_cs 
*ring)
-- 
2.5.0



Re: [Intel-gfx] [PATCH v1] drm/i915/guc: Fix a fw content lost issue after it is evicted

2015-11-11 Thread Yu Dai



On 11/11/2015 01:07 AM, Chris Wilson wrote:

On Tue, Nov 10, 2015 at 03:27:36PM -0800, yu@intel.com wrote:
> From: Alex Dai 
>
> We keep a copy of GuC fw in a GEM obj. However its content is lost
> if the GEM obj is swapped (igt/gem_evict_*). Therefore, the later
> fw loading during GPU reset will fail. Mark the obj dirty after
> copying data into the pages. So its content will be kept during
> swapped out.
>
> Signed-off-by: Alex Dai 
> ---
>  drivers/gpu/drm/i915/i915_gem.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index f1e3fde..3b15167 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -5199,6 +5199,7 @@ i915_gem_object_create_from_data(struct drm_device *dev,
>i915_gem_object_pin_pages(obj);
>sg = obj->pages;
>bytes = sg_copy_from_buffer(sg->sgl, sg->nents, (void *)data, size);
> +  obj->dirty = 1;

That's one way of doing it, but atypical for our CPU access to the pages.
The root cause is still that sg_copy_from_buffer() isn't calling
set_page_dirty() and so there will be other places in the kernel that
fall foul of it. Also, note that we could have written this so as not to
require the whole object to be pinned at once as well.

I would prefer this to be fixed in sg_copy_from_buffer() for the reason
that all callers are susceptible to this bug.
-Chris

Makes sense. I will keep this as it is for now. We want to set this GEM obj 
dirty too, in addition to marking each page dirty. A change in 
sg_copy_buffer() would have a big impact on the kernel, though.


Thanks,
Alex


Re: [Intel-gfx] [PATCH] drm/i915/guc: Fix a fw content lost issue after it is evicted

2015-11-10 Thread Yu Dai



On 11/09/2015 02:15 AM, Chris Wilson wrote:

On Fri, Nov 06, 2015 at 03:18:37PM -0800, Yu Dai wrote:
>
>
> On 11/06/2015 02:07 PM, Chris Wilson wrote:
> >On Fri, Nov 06, 2015 at 01:55:27PM -0800, yu@intel.com wrote:
> >> From: Alex Dai <yu@intel.com>
> >>
> >> We keep a copy of GuC fw in a GEM obj. However its content is lost
> >> if the GEM obj is evicted (igt/gem_evict_*). Therefore, the later
> >> fw loading during GPU reset will fail.
> >
> >No, it's not. The bug is in sg_copy_buffer called by
> >i915_gem_object_create_from_data introduced by yourselves.
> >
>
> My understanding is that sg_copy_from_buffer is used to copy data.
> Can you clarify why using this will cause such issue?

"However its content is lost if the GEM obj is evicted (igt/gem_evict_*)."

is not strictly true. The content is lost if the object is swapped
because the page is never marked as dirty. sg_copy_buffer() is copying
into the page but never marks said page as dirty.
-Chris


Thanks, Chris. Now the patch is a one-line change. :)

Alex


[Intel-gfx] [PATCH v1] drm/i915/guc: Fix a fw content lost issue after it is evicted

2015-11-10 Thread yu . dai
From: Alex Dai 

We keep a copy of the GuC firmware in a GEM object. However, its
content is lost if the GEM object is swapped out (igt/gem_evict_*),
and the later firmware loading during GPU reset will then fail. Mark
the object dirty after copying data into its pages, so that its
content is preserved across swap-out.

Signed-off-by: Alex Dai 
---
 drivers/gpu/drm/i915/i915_gem.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index f1e3fde..3b15167 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -5199,6 +5199,7 @@ i915_gem_object_create_from_data(struct drm_device *dev,
i915_gem_object_pin_pages(obj);
sg = obj->pages;
bytes = sg_copy_from_buffer(sg->sgl, sg->nents, (void *)data, size);
+   obj->dirty = 1;
i915_gem_object_unpin_pages(obj);
 
if (WARN_ON(bytes != size)) {
-- 
2.5.0



  1   2   3   >