On ke, 2017-02-15 at 18:28 -0800, Daniele Ceraolo Spurio wrote:
> 
> On 14/02/17 05:53, Joonas Lahtinen wrote:
> > 
> > Started adding proper teardown to guc_client_alloc, ended up removing
> > quite a few dead ends where errors communicating with the GuC were
> > silently ignored. There also seemed to be quite a few erronous
> > teardown actions performed in case of an error (ordering wrong).
> > 
> > v2:
> >     - Increase function symmetry/proximity (Michal/Daniele)
> >     - Fix __reserve_doorbell accounting for high priority (Daniele)
> >     - Call __update_doorbell_desc! (Daniele)
> >     - Isolate __guc_{,de}allocate_doorbell (Michal/Daniele)
> > 
> > Signed-off-by: Joonas Lahtinen <joonas.lahti...@linux.intel.com>
> > Cc: Michal Wajdeczko <michal.wajdec...@intel.com>
> > Cc: Arkadiusz Hiler <arkadiusz.hi...@intel.com>
> > Cc: Oscar Mateo <oscar.ma...@intel.com>
> > Cc: Tvrtko Ursulin <tvrtko.ursu...@intel.com>
> > Cc: Chris Wilson <ch...@chris-wilson.co.uk>
> > Cc: Daniele Ceraolo Spurio <daniele.ceraolospu...@intel.com>

<SNIP>

> > +static bool has_doorbell(struct i915_guc_client *client)
> > +{
> > +   if (client->doorbell_id == GUC_DOORBELL_INVALID)
> > +           return false;
> > 
> > -   /* Activate the new doorbell */
> > -   __set_bit(new_id, doorbell_bitmap);
> > +   return test_bit(client->doorbell_id, client->guc->doorbell_bitmap);
> 
> Should we warn/gem_bug if client->doorbell_id != GUC_DOORBELL_INVALID 
> and the bit in guc->doorbell_bitmap is not set? Not sure if you plan to 
> decouple them more in the future, but with the current code it should 
> always be impossible to have a valid doorbell without the bit in the 
> bitmask being set. Maybe we can just return client->doorbell_id != 
> GUC_DOORBELL_INVALID here if we have no plan for a case where they can 
> be out of sync.

Kinda a leftover from restructuring, the selection and reservation were
a different thing at some point. Changed into GEM_BUG_ON.

> > +static int __destroy_doorbell(struct i915_guc_client *client)
> >  {
> > -   (void)guc_update_doorbell_id(guc, client, GUC_INVALID_DOORBELL_ID);
> > +   struct guc_doorbell_info *doorbell;
> > 
> > -   /* XXX: wait for any interrupts */
> > -   /* XXX: wait for workqueue to drain */
> > +   doorbell = __get_doorbell(client);
> > +   doorbell->db_status = GUC_DOORBELL_DISABLED;
> > +   doorbell->cookie = 0;
> > +
> 
> Not from this patch (but since we're at it...), I did a bit of digging 
> and I've found that doorbell release flow requires SW to poll the 
> GEN8_DRBREGL(db_id) register after updating doorbell->db_status to wait 
> for the GEN8_DRB_VALID bit to go to zero. This ensures that any pending 
> events are processed before we call into GuC to release the doorbell. 
> Note that GuC will fail the DEALLOCATE_DOORBELL action if the bit has 
> not gone to zero yet. This is currently not an issue, probably because 
> we use a single doorbell and we don't ring it near release time, but the 
> situation could change in the future so I believe it is worth to fix it 
> now. I can follow up with an additional patch if you want to keep this 
> one as refactoring only.

Ack, add a follow-up on top of your series.

> > -static uint32_t select_doorbell_cacheline(struct intel_guc *guc)
> > +   return 0;
> > +}
> > +
> > +static unsigned long __reserve_cacheline(struct intel_guc* guc)
> 
> "reserve_cacheline" doesn't really reflect what happens, because more 
> doorbells can use the same cacheline (while they are on different 
> physical pages) and there is no concept of unreserving the cacheline. 
> The idea is to try to avoid having lots of doorbells on the same 
> cacheline if possible to make the monitoring more efficient on the HW, 
> so I'd keep the "select_cacheline" naming for this function. We should 
> probably also look at modifying the function to use something more 
> elaborated than a simple round robin scheme to ensure doorbells are 
> equally distribuited on cachelines, but that can probably wait until we 
> use more doorbells.

Renamed.


> >  /*
> >   * Borrow the first client to set up & tear down each unused doorbell
> >   * in turn, to ensure that all doorbell h/w is (re)initialised.
> >   */
> > -static void guc_init_doorbell_hw(struct intel_guc *guc)
> > +static int guc_init_doorbell_hw(struct intel_guc *guc)
> >  {
> >     struct i915_guc_client *client = guc->execbuf_client;
> > -   uint16_t db_id;
> > -   int i, err;
> > +   int err;
> > +   int i;
> > 
> > -   guc_disable_doorbell(guc, client);
> > +   if (has_doorbell(client))
> > +           destroy_doorbell(client);
> > 
> > -   for (i = 0; i < GUC_MAX_DOORBELLS; ++i) {
> > -           /* Skip if doorbell is OK */
> > -           if (guc_doorbell_check(guc, i))
> > +   for (i = 0; i < GUC_NUM_DOORBELLS; ++i) {
> > +           if (doorbell_ok(guc, i))
> >                     continue;
> > 
> 
> As mentioned in the previous patch version, here we assume that all the 
> doorbells should be disabled an we want to reset HW that has been left 
> enabled from previous users, so the doorbell_bitmap should be clear. 
> Maybe adding a simple
> 
>       GEM_BUG_ON(test_bit(i, guc->doorbell_bitmap));
> 
> will help making sure that we're not leaving bits set when 
> releasing/resetting.

The functions don't actually even touch doorbell_bitmap, they just nuke
all active clients. That's what the previous code did, as I read it.


> > -           err = guc_update_doorbell_id(guc, client, i);
> > -           if (err)
> > -                   DRM_DEBUG_DRIVER("Doorbell %d update failed, err %d\n",
> > -                                   i, err);
> > +           err = __reset_doorbell(client, i);
> 
> If the reset fails the doorbell is in a bad state, so it might be worth 
> to ensure that the bit is set in the bitmask to make sure we don't 
> assign that doorbell to any client, but we'd have to make sure to clear 
> the bitmask on GuC reset (see comment above).

Clearing the bits now in guc_init_doorbell_hw().


> > +           WARN(err, "Doorbell %d reset failed, err %d\n", i, err);
> >     }
> > 
> > -   db_id = select_doorbell_register(guc, client->priority);
> > -   WARN_ON(db_id == GUC_INVALID_DOORBELL_ID);
> > -
> > -   err = guc_update_doorbell_id(guc, client, db_id);
> > +   err = __reserve_doorbell(client);
> >     if (err)
> > -           DRM_WARN("Failed to restore doorbell to %d, err %d\n",
> > -                    db_id, err);
> > +           return err;
> 
> Continuing the discussion from the previous patch revision:
> 
> <snip>
>  > Shouldn't this be create_doorbell() instead of reserve?
> 
> That would end up calling the __update_doorbell, which we can't as the
> desc is for some strange reason not mapped yet.
> </snip>
> 
> As far as I can see, everything should already be allocated and mapped 
> here. Aren't we anyway already calling __update_doorbell both in the 
> client_alloc and in __reset_doorbell above?
> Also, where are we re-creating the doorbell that we destroyed at the 
> beginning of the function?
> 

Hmm, refactored the extra destroy cycle out.

<SNIP>

> > @@ -753,27 +781,35 @@ guc_client_alloc(struct drm_i915_private *dev_priv,
> >     guc_proc_desc_init(guc, client);
> >     guc_ctx_desc_init(guc, client);
> > 
> > -   /* For runtime client allocation we need to enable the doorbell. Not
> > -    * required yet for the static execbuf_client as this special kernel
> > -    * client is enabled from i915_guc_submission_enable().
> > -    *
> > -    * guc_update_doorbell_id(guc, client, db_id);
> > -    */
> > +   /* For runtime client allocation we need to enable the doorbell. */
> 
> this comment is a bit unclear now because you're not deferring the 
> initialization of execbuf_client to guc_init_doorbell_hw anymore, so 
> there is no difference between execbuf_client and "runtime" clients. 
> Maybe we can just remove the comment.

Removed.

> Note that creating the doorbell here will cause it to be destroyed and 
> re-allocated when i915_guc_submission_enable is called. A worth 
> compromise IMO to avoid special cases, but worth pointing out :)

Refactored a bit.

Regards, Joonas
-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

Reply via email to