Re: [Intel-gfx] [PATCH 4/4] drm/i915: Late request cancellations are harmful
On Mon, Apr 11, 2016 at 02:50:17PM +0100, Tvrtko Ursulin wrote:
>
> On 09/04/16 10:27, Chris Wilson wrote:
> >Conceptually, each request is a record of a hardware transaction - we
> >build up a list of pending commands and then either commit them to
> >hardware, or cancel them. However, whilst building up the list of
> >pending commands, we may modify state outside of the request and make
> >references to the pending request. If we do so and then cancel that
> >request, external objects then point to the deleted request leading to
> >both graphical and memory corruption.
> >
> >The easiest example is to consider object/VMA tracking. When we mark an
> >object as active in a request, we store a pointer to this, the most
> >recent request, in the object. Then, when we want to free that object,
> >we wait for the most recent request to be idle before proceeding
> >(otherwise the hardware will write to pages now owned by the system, or
> >we will attempt to read from those pages before the hardware is
> >finished writing). If the request was cancelled instead, that wait
> >completes immediately. As a result, all requests must be committed and
> >not cancelled if the external state is unknown.
>
> This was a bit hard to figure out.
>
> So we cannot unwind because once we set last_read_req we lose the
> data on what was the previous one, before this transaction started?
>
> Intuitively I don't like the idea of sending unfinished stuff to the
> GPU, when it failed at some random point in ring buffer preparation.

Yes, but it is not unfinished though. The mm-switch is made and flagged
as completed, etc. The request does contain instructions for the state
changes it has made so far.

> So I am struggling with reviewing this as I have in the previous round.
>
> >All that remains of i915_gem_request_cancel() users are just a couple of
> >extremely unlikely allocation failures, so remove the API entirely.
>
> This part feels extra weird because in the non-execbuf cases we
> actually can cancel the transaction without any issues, correct?

Nope. The same problem arises in that they do not know what happens
underneath calls to e.g. object_sync. Before cancellation you have to
inspect every path, now and in the future, to be sure that no state was
modified inside the request.

> Would middle-ground be to keep the cancellations for in-kernel
> submits, and for execbuf rewind the ringbuf so only request
> post-amble is sent to the GPU?

The only place that knows whether there is an external observer is the
request code. I am thinking about doing the cancellation there, I just
need to check that the lockless idling will be ok with that.

> >@@ -3410,12 +3404,9 @@ int i915_gpu_idle(struct drm_device *dev)
> >  				return PTR_ERR(req);
> >
> >  			ret = i915_switch_context(req);
> >-			if (ret) {
> >-				i915_gem_request_cancel(req);
> >-				return ret;
> >-			}
> >-
> >  			i915_add_request_no_flush(req);
> >+			if (ret)
> >+				return ret;
>
> Looks like with this it could execute the context switch on the GPU
> but not update the engine->last_context in do_switch().

Hmm, that problem is present in the current code - requests are not
unwound, just deleted. We need to rearrange switch_context after the
mi_set_context() to ensure that on the subsequent call, the context is
reloaded.
-Chris

--
Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
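
For readers following the i915_gpu_idle() exchange, here is a minimal user-space sketch of the concern, not driver code. Every toy_* name is hypothetical, and the early-return skip in toy_switch_context() is an assumption about how the software bookkeeping short-circuits a switch to the context it believes is already loaded. It models the reordered hunk (always commit, then report the error) and shows how a late failure can leave software and hardware disagreeing about the loaded context.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy engine state; the field names only loosely mirror the driver. */
struct toy_engine {
	int hw_context;		/* context the pretend GPU ends up running */
	int last_context;	/* what software believes is loaded */
	int committed;		/* number of requests handed to hardware */
};

/*
 * Emit a context switch into the request. If fail_late is set, the error
 * happens after the switch command is already in the ring, so the
 * software bookkeeping is never updated.
 */
static int toy_switch_context(struct toy_engine *e, int ctx, bool fail_late)
{
	assert(e);
	if (e->last_context == ctx)
		return 0;		/* software thinks nothing needs doing */
	e->hw_context = ctx;		/* command emitted into the ring */
	if (fail_late)
		return -1;
	e->last_context = ctx;
	return 0;
}

/* Mirrors the reordered hunk: always commit, then report the error. */
static int toy_gpu_idle_step(struct toy_engine *e, int ctx, bool fail_late)
{
	int ret = toy_switch_context(e, ctx, fail_late);

	e->committed++;		/* stands in for i915_add_request_no_flush() */
	if (ret)
		return ret;
	return 0;
}

/* True when software and hardware disagree about the loaded context. */
static bool toy_context_mismatch(const struct toy_engine *e)
{
	assert(e);
	return e->hw_context != e->last_context;
}
```

In this model, starting from context 1 and calling toy_gpu_idle_step(&e, 2, true) commits the request and returns the error, after which toy_context_mismatch() reports a disagreement. A following toy_gpu_idle_step(&e, 1, false) is then skipped because the bookkeeping still says context 1 is loaded. That is roughly why the reply suggests reordering the bookkeeping relative to mi_set_context() so that the next call reloads the context.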
Re: [Intel-gfx] [PATCH 4/4] drm/i915: Late request cancellations are harmful
On 09/04/16 10:27, Chris Wilson wrote:
>Conceptually, each request is a record of a hardware transaction - we
>build up a list of pending commands and then either commit them to
>hardware, or cancel them. However, whilst building up the list of
>pending commands, we may modify state outside of the request and make
>references to the pending request. If we do so and then cancel that
>request, external objects then point to the deleted request leading to
>both graphical and memory corruption.
>
>The easiest example is to consider object/VMA tracking. When we mark an
>object as active in a request, we store a pointer to this, the most
>recent request, in the object. Then, when we want to free that object,
>we wait for the most recent request to be idle before proceeding
>(otherwise the hardware will write to pages now owned by the system, or
>we will attempt to read from those pages before the hardware is
>finished writing). If the request was cancelled instead, that wait
>completes immediately. As a result, all requests must be committed and
>not cancelled if the external state is unknown.

This was a bit hard to figure out.

So we cannot unwind because once we set last_read_req we lose the
data on what was the previous one, before this transaction started?

Intuitively I don't like the idea of sending unfinished stuff to the
GPU, when it failed at some random point in ring buffer preparation.

So I am struggling with reviewing this as I have in the previous round.

>All that remains of i915_gem_request_cancel() users are just a couple of
>extremely unlikely allocation failures, so remove the API entirely.

This part feels extra weird because in the non-execbuf cases we
actually can cancel the transaction without any issues, correct?

Would middle-ground be to keep the cancellations for in-kernel
submits, and for execbuf rewind the ringbuf so only request
post-amble is sent to the GPU?
[Intel-gfx] [PATCH 4/4] drm/i915: Late request cancellations are harmful
Conceptually, each request is a record of a hardware transaction - we
build up a list of pending commands and then either commit them to
hardware, or cancel them. However, whilst building up the list of
pending commands, we may modify state outside of the request and make
references to the pending request. If we do so and then cancel that
request, external objects then point to the deleted request leading to
both graphical and memory corruption.

The easiest example is to consider object/VMA tracking. When we mark an
object as active in a request, we store a pointer to this, the most
recent request, in the object. Then, when we want to free that object,
we wait for the most recent request to be idle before proceeding
(otherwise the hardware will write to pages now owned by the system, or
we will attempt to read from those pages before the hardware is
finished writing). If the request was cancelled instead, that wait
completes immediately. As a result, all requests must be committed and
not cancelled if the external state is unknown.

All that remains of i915_gem_request_cancel() users are just a couple of
extremely unlikely allocation failures, so remove the API entirely.

A consequence of committing all incomplete requests is that we generate
excess breadcrumbs and fill the ring much more often with dummy work. We
have completely undone the outstanding_last_seqno optimisation.
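
The object-tracking hazard described above can be sketched in a few lines of user-space C. This is an illustrative toy only: the toy_* structures and helpers are hypothetical and far simpler than the driver's, and a "cancelled" flag stands in for the request being deleted (so the example avoids an actual use-after-free). It assumes, as the description says, that only the most recent request is tracked per object.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for illustration only; not the real i915 structures. */
struct toy_request {
	bool completed;	/* the pretend hardware has finished this request */
	bool cancelled;	/* the request was dropped before submission */
};

struct toy_object {
	struct toy_request *last_req;	/* most recent request using the object */
};

/* Marking the object active overwrites whatever was tracked before. */
static void toy_mark_active(struct toy_object *obj, struct toy_request *req)
{
	assert(obj && req);
	obj->last_req = req;
}

/* The wait before freeing: only the most recent request is consulted. */
static bool toy_object_is_idle(const struct toy_object *obj)
{
	assert(obj);
	if (!obj->last_req || obj->last_req->cancelled)
		return true;	/* nothing to wait for, completes immediately */
	return obj->last_req->completed;
}

/*
 * Request A is in flight and uses the object. Request B marks the object
 * active too, then hits an error while being built. Returns true if the
 * object would be reported idle while A is still executing, i.e. the
 * free would be unsafe.
 */
static bool toy_free_is_unsafe(bool cancel_on_error)
{
	struct toy_request a = { .completed = false, .cancelled = false };
	struct toy_request b = { .completed = false, .cancelled = false };
	struct toy_object obj = { .last_req = NULL };

	toy_mark_active(&obj, &a);	/* A submitted, still running */
	toy_mark_active(&obj, &b);	/* B under construction; A is forgotten */

	if (cancel_on_error)
		b.cancelled = true;	/* old behaviour: drop B */
	/* else: B is committed behind A and completes only after A does */

	return toy_object_is_idle(&obj) && !a.completed;
}
```

In this model toy_free_is_unsafe(true) reports the hazard, because the wait sees only the cancelled request B and returns at once while A is still running. toy_free_is_unsafe(false) does not, because the committed B keeps the object busy until the hardware has drained past A.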
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93907
Signed-off-by: Chris Wilson
Cc: Daniel Vetter
Cc: Tvrtko Ursulin
Cc: sta...@vger.kernel.org
---
 drivers/gpu/drm/i915/i915_drv.h            |  2 --
 drivers/gpu/drm/i915/i915_gem.c            | 50 --
 drivers/gpu/drm/i915/i915_gem_context.c    | 21 ++---
 drivers/gpu/drm/i915/i915_gem_execbuffer.c | 15 +++--
 drivers/gpu/drm/i915/intel_display.c       |  2 +-
 drivers/gpu/drm/i915/intel_lrc.c           |  4 +--
 drivers/gpu/drm/i915/intel_overlay.c       |  8 ++---
 7 files changed, 39 insertions(+), 63 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index a93e5dd4fa9a..f374db8de673 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2320,7 +2320,6 @@ struct drm_i915_gem_request {
 struct drm_i915_gem_request * __must_check
 i915_gem_request_alloc(struct intel_engine_cs *engine,
 		       struct intel_context *ctx);
-void i915_gem_request_cancel(struct drm_i915_gem_request *req);
 void i915_gem_request_free(struct kref *req_ref);
 int i915_gem_request_add_to_client(struct drm_i915_gem_request *req,
 				   struct drm_file *file);
@@ -2872,7 +2871,6 @@ int i915_gem_sw_finish_ioctl(struct drm_device *dev, void *data,
 			     struct drm_file *file_priv);
 void i915_gem_execbuffer_move_to_active(struct list_head *vmas,
 					struct drm_i915_gem_request *req);
-void i915_gem_execbuffer_retire_commands(struct i915_execbuffer_params *params);
 int i915_gem_ringbuffer_submission(struct i915_execbuffer_params *params,
 				   struct drm_i915_gem_execbuffer2 *args,
 				   struct list_head *vmas);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 1c3ff56594d6..42227495803f 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2753,7 +2753,8 @@ __i915_gem_request_alloc(struct intel_engine_cs *engine,
 	 * fully prepared. Thus it can be cleaned up using the proper
 	 * free code.
 	 */
-	i915_gem_request_cancel(req);
+	intel_ring_reserved_space_cancel(req->ringbuf);
+	i915_gem_request_unreference(req);
 	return ret;
 }

@@ -2790,13 +2791,6 @@ i915_gem_request_alloc(struct intel_engine_cs *engine,
 	return err ? ERR_PTR(err) : req;
 }

-void i915_gem_request_cancel(struct drm_i915_gem_request *req)
-{
-	intel_ring_reserved_space_cancel(req->ringbuf);
-
-	i915_gem_request_unreference(req);
-}
-
 struct drm_i915_gem_request *
 i915_gem_find_active_request(struct intel_engine_cs *engine)
 {
@@ -3410,12 +3404,9 @@ int i915_gpu_idle(struct drm_device *dev)
 				return PTR_ERR(req);

 			ret = i915_switch_context(req);
-			if (ret) {
-				i915_gem_request_cancel(req);
-				return ret;
-			}
-
 			i915_add_request_no_flush(req);
+			if (ret)
+				return ret;
 		}

 		ret = intel_engine_idle(engine);
@@ -4917,34 +4908,33 @@ i915_gem_init_hw(struct drm_device *dev)
 		req = i915_gem_request_alloc(engine, NULL);
 		if (IS_ERR(req)) {
 			ret = PTR_ERR(req);
-