[PATCH v9 2/2] drm/i915/guc: Close deregister-context race against CT-loss

2023-12-27 Thread Alan Previn
If we are at the end of suspend or very early in resume its possible an async fence signal (via rcu_call) is triggered to free_engines which could lead us to the execution of the context destruction worker (after a prior worker flush). Thus, when suspending, insert rcu_barriers at the start of i91

[PATCH v9 1/2] drm/i915/guc: Flush context destruction worker at suspend

2023-12-27 Thread Alan Previn
When suspending, flush the context-guc-id deregistration worker at the final stages of intel_gt_suspend_late when we finally call gt_sanitize that eventually leads down to __uc_sanitize so that the deregistration worker doesn't fire off later as we reset the GuC microcontroller. Signed-off-by: Ala

[PATCH v9 0/2] Resolve suspend-resume racing with GuC destroy-context-worker

2023-12-27 Thread Alan Previn
This series is the result of debugging issues root caused to races between the GuC's destroyed_worker_func being triggered vs repeating suspend-resume cycles with concurrent delayed fence signals for engine-freeing. The reproduction steps require that an app is launched right before the start of t

Re: [PATCH v8 2/2] drm/i915/guc: Close deregister-context race against CT-loss

2023-12-27 Thread Teres Alexis, Alan Previn
On Tue, 2023-12-26 at 10:11 -0500, Vivi, Rodrigo wrote: > On Wed, Dec 20, 2023 at 11:08:59PM +, Teres Alexis, Alan Previn wrote: > > On Wed, 2023-12-13 at 16:23 -0500, Vivi, Rodrigo wrote: alan:snip > > > > > > alan: Thanks Rodrigo for the RB last week, just quick update: > > > > I've cant