Re: [Intel-gfx] [PATCH] drm/mm: Break long searches in fragmented address spaces

2020-02-13 Thread Chris Wilson
Quoting Andi Shyti (2020-02-11 22:56:44)
> Hi Chris,
> 
> On Fri, Feb 07, 2020 at 03:17:20PM +, Chris Wilson wrote:
> > We try hard to select a suitable hole in the drm_mm first time. But if
> > that is unsuccessful, we then have to look at neighbouring nodes, and
> > this requires traversing the rbtree. Walking the rbtree can be slow
> > (much slower than a linear list for deep trees), and if the drm_mm has
> > been purposefully fragmented our search can be trapped for a long, long
> > time. For non-preemptible kernels, we need to break up long CPU bound
> > sections by manually checking for cond_resched(); similarly we should
> 
> checking for "fatal signals" you mean?

We need to call schedule() ourselves for non-voluntary CONFIG_PREEMPT
kernels, and we need to escape from the kernel back to userspace to
handle a pending signal in this process.

Other applications, sysadmins tend to notice and complain about driver
hogging a CPU not letting anything else run; typically the user notices
that their application doesn't close on ^C.

(That is schedule() delays of a few ms will be noticed, but it takes a
couple of seconds before a user will become aware of a severe problem
with a stuck signal. :)

Now since responding to a signal itself may be expensive (we have to
restart the syscall and re-process all the state from before this
point), we make the trade-off here of only responding to a fatal-signal
(process exit), which should allow for faster system recovery under
pathological conditions.
-Chris
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [Intel-gfx] [PATCH] drm/mm: Break long searches in fragmented address spaces

2020-02-11 Thread Andi Shyti
Hi Chris,

On Fri, Feb 07, 2020 at 03:17:20PM +, Chris Wilson wrote:
> We try hard to select a suitable hole in the drm_mm first time. But if
> that is unsuccessful, we then have to look at neighbouring nodes, and
> this requires traversing the rbtree. Walking the rbtree can be slow
> (much slower than a linear list for deep trees), and if the drm_mm has
> been purposefully fragmented our search can be trapped for a long, long
> time. For non-preemptible kernels, we need to break up long CPU bound
> sections by manually checking for cond_resched(); similarly we should

checking for "fatal signals" you mean?

> also bail out if we have been told to terminate. (In an ideal world, we
> would break for any signal, but we need to trade off having to perform
> the search again after ERESTARTSYS, which again may form a trap of
> making no forward progress.)
> 
> Reported-by: Zbigniew Kempczyński 
> Signed-off-by: Chris Wilson 
> Cc: Zbigniew Kempczyński 
> Cc: Joonas Lahtinen 

nice!

Reviewed-by: Andi Shyti 

Thanks,
Andi

> ---
>  drivers/gpu/drm/drm_mm.c | 8 +++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/drm_mm.c b/drivers/gpu/drm/drm_mm.c
> index 2a6e34663146..47d5de9ca0a8 100644
> --- a/drivers/gpu/drm/drm_mm.c
> +++ b/drivers/gpu/drm/drm_mm.c
> @@ -45,6 +45,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  
> @@ -366,6 +367,11 @@ next_hole(struct drm_mm *mm,
> struct drm_mm_node *node,
> enum drm_mm_insert_mode mode)
>  {
> + /* Searching is slow; check if we ran out of time/patience */
> + cond_resched();
> + if (fatal_signal_pending(current))
> + return NULL;
> +
>   switch (mode) {
>   default:
>   case DRM_MM_INSERT_BEST:
> @@ -557,7 +563,7 @@ int drm_mm_insert_node_in_range(struct drm_mm * const mm,
>   return 0;
>   }
>  
> - return -ENOSPC;
> + return signal_pending(current) ? -ERESTARTSYS : -ENOSPC;
>  }
>  EXPORT_SYMBOL(drm_mm_insert_node_in_range);
>  
> -- 
> 2.25.0
> 
> ___
> Intel-gfx mailing list
> intel-...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel