Re: [Intel-gfx] [PATCH 2/2] drm/i915: Fix timeout handling in i915_gem_shrinker_vmap

2019-01-09 Thread Chris Wilson
Quoting Tvrtko Ursulin (2019-01-09 15:07:34)
> 
> On 09/01/2019 14:23, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2019-01-09 14:12:47)
> >> From: Tvrtko Ursulin 
> >>
> >> The code tries to grab struct mutex for 5ms every time the unlocked GPU
> >> idle wait succeeds. But the GPU idle wait itself is practically unbound
> >> which means the 5ms timeout might not be honoured.
> > 
> > The arbitrary timeout is merely advisory. If we hang, we get to recover
> > everything we can!
> 
> You mean if we hang, eg. take more than 5ms to idle, the function bails 
> out on the 5ms timeout check, failing to take the lock and free anything? ;)

No, just that I picked 5 _seconds_ as arbitrary timeout. As it now hits
the mutex_lock() path, the whole polling + timeout is kaput. What I
think we should do is use mutex_lock_killable instead.
-Chris
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 2/2] drm/i915: Fix timeout handling in i915_gem_shrinker_vmap

2019-01-09 Thread Tvrtko Ursulin


On 09/01/2019 14:23, Chris Wilson wrote:

Quoting Tvrtko Ursulin (2019-01-09 14:12:47)

From: Tvrtko Ursulin 

The code tries to grab struct mutex for 5ms every time the unlocked GPU
idle wait succeeds. But the GPU idle wait itself is practically unbound
which means the 5ms timeout might not be honoured.


The arbitrary timeout is merely advisory. If we hang, we get to recover
everything we can!


You mean if we hang, eg. take more than 5ms to idle, the function bails 
out on the 5ms timeout check, failing to take the lock and free anything? ;)


Regards,

Tvrtko
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 2/2] drm/i915: Fix timeout handling in i915_gem_shrinker_vmap

2019-01-09 Thread Chris Wilson
Quoting Tvrtko Ursulin (2019-01-09 14:12:47)
> From: Tvrtko Ursulin 
> 
> The code tries to grab struct mutex for 5ms every time the unlocked GPU
> idle wait succeeds. But the GPU idle wait itself is practically unbound
> which means the 5ms timeout might not be honoured.
> 
> Cap the GPU idle wait to 5ms as well to fix this.
> 
> v2:
>  * Rebase.
> 
> Signed-off-by: Tvrtko Ursulin 
> ---
>  drivers/gpu/drm/i915/i915_gem_shrinker.c | 12 ++--
>  1 file changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem_shrinker.c 
> b/drivers/gpu/drm/i915/i915_gem_shrinker.c
> index ce539d38461c..4ee393028a93 100644
> --- a/drivers/gpu/drm/i915/i915_gem_shrinker.c
> +++ b/drivers/gpu/drm/i915/i915_gem_shrinker.c
> @@ -402,13 +402,12 @@ i915_gem_shrinker_scan(struct shrinker *shrinker, 
> struct shrink_control *sc)
>  
>  static bool
>  shrinker_lock_uninterruptible(struct drm_i915_private *i915, bool *unlock,
> - int timeout_ms)
> + unsigned long timeout)
>  {
> -   unsigned long timeout = jiffies + 
> msecs_to_jiffies_timeout(timeout_ms);
> +   const unsigned long timeout_end = jiffies + timeout;
>  
> do {
> -   if (i915_gem_wait_for_idle(i915,
> -  0, MAX_SCHEDULE_TIMEOUT) == 0 &&
> +   if (i915_gem_wait_for_idle(i915, 0, timeout) == 0 &&

Hmm, I wonder if this code could be hitting i915_request_wait():

#if IS_ENABLED(CONFIG_LOCKDEP)
GEM_BUG_ON(debug_locks &&
   !!lockdep_is_held(&rq->i915->drm.struct_mutex) !=
   !!(flags & I915_WAIT_LOCKED));
#endif

as i915_gem_shrinker_vmap() could conceivably be called under
struct_mutex, e.g. i915_gem_object_pin_map(), and I'm sure I've seen the
vmap arena used for setting pte bits.

Onwards to removing that restriction!
-Chris
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 2/2] drm/i915: Fix timeout handling in i915_gem_shrinker_vmap

2019-01-09 Thread Chris Wilson
Quoting Tvrtko Ursulin (2019-01-09 14:12:47)
> From: Tvrtko Ursulin 
> 
> The code tries to grab struct mutex for 5ms every time the unlocked GPU
> idle wait succeeds. But the GPU idle wait itself is practically unbound
> which means the 5ms timeout might not be honoured.

The arbitrary timeout is merely advisory. If we hang, we get to recover
everything we can!
-Chris
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx