Re: [PATCH] mutex: Report recursive ww_mutex locking early
On 30-05-16 at 12:45, Chris Wilson wrote:
> On Mon, May 30, 2016 at 12:27:46PM +0200, Peter Zijlstra wrote:
>> On Mon, May 30, 2016 at 11:43:31AM +0200, Maarten Lankhorst wrote:
>>> Patch not applied, SCHED_RR:
>>
>> ww_mutex isn't RT aware at all; it's one of the things I still have on a
>> todo list. Should I look harder at finding time for this?
>
> The RT usage in the test is just to try and starve the kernel threads
> that may be used behind the atomic modeset - a problem we have
> encountered in the past. Afaik, no one is using ww_mutex from RT in the
> wild; calling the atomic modeset from RT was just a shortcut to having
> the system fully populated with RT threads. To be more realistic we
> should be using a couple of normal modesetting threads vs a set of RT
> cpu hogs.

Yeah, unfortunately this doesn't work as you intend it to. You'd need to
spawn a few more threads at slightly lower priority, so that when a
thread is blocked waiting to acquire the mutexes, the workqueues still
can't run. As it is, ssh stays responsive with the rest running.

> Otoh, i915.ko always draws the ire of rt-linux, so ww_mutex is likely
> to be in their sights in the near future (when i915.ko completes its
> transition to full atomic modesetting).
Re: [PATCH] mutex: Report recursive ww_mutex locking early
On Mon, May 30, 2016 at 12:27:46PM +0200, Peter Zijlstra wrote:
> On Mon, May 30, 2016 at 11:43:31AM +0200, Maarten Lankhorst wrote:
> > Patch not applied, SCHED_RR:
>
> ww_mutex isn't RT aware at all; it's one of the things I still have on a
> todo list. Should I look harder at finding time for this?

The RT usage in the test is just to try and starve the kernel threads
that may be used behind the atomic modeset - a problem we have
encountered in the past. Afaik, no one is using ww_mutex from RT in the
wild; calling the atomic modeset from RT was just a shortcut to having
the system fully populated with RT threads. To be more realistic we
should be using a couple of normal modesetting threads vs a set of RT
cpu hogs.

Otoh, i915.ko always draws the ire of rt-linux, so ww_mutex is likely to
be in their sights in the near future (when i915.ko completes its
transition to full atomic modesetting).
-Chris
-- 
Chris Wilson, Intel Open Source Technology Centre
Re: [PATCH] mutex: Report recursive ww_mutex locking early
On Mon, May 30, 2016 at 11:43:31AM +0200, Maarten Lankhorst wrote:
> Patch not applied, SCHED_RR:

ww_mutex isn't RT aware at all; it's one of the things I still have on a
todo list. Should I look harder at finding time for this?
Re: [PATCH] mutex: Report recursive ww_mutex locking early
On 30-05-16 at 11:11, Peter Zijlstra wrote:
> On Mon, May 30, 2016 at 09:43:53AM +0200, Maarten Lankhorst wrote:
>> On 26-05-16 at 22:08, Chris Wilson wrote:
>>> Recursive locking for ww_mutexes was originally conceived as an
>>> exception. However, it is heavily used by the DRM atomic modesetting
>>> code. Currently, the recursive deadlock is checked after we have queued
>>> up for a busy-spin and as we never release the lock, we spin until
>>> kicked, whereupon the deadlock is discovered and reported.
>>>
>>> A simple solution for the now common problem is to move the recursive
>>> deadlock discovery to the first action when taking the ww_mutex.
>>>
>>> Testcase: igt/kms_cursor_legacy
>
> I've no idea what this tag is or where to find the actual testcase, so
> I've killed it.

https://cgit.freedesktop.org/xorg/app/intel-gpu-tools/

tests/kms_cursor_legacy tries to do as many updates as possible with
SCHED_RR.

Patch not applied, SCHED_RR:
# ./kms_cursor_legacy
IGT-Version: 1.14-g9579e5447aa3 (x86_64) (Linux: 4.6.0-patser+ x86_64)
[3] count=86
[2] count=91
[1] count=78
[0] count=104
Subtest stress-bo: SUCCESS (22,372s)

Patch not applied, SCHED_NORMAL:
# ./kms_cursor_legacy
IGT-Version: 1.14-g9579e5447aa3 (x86_64) (Linux: 4.6.0-patser+ x86_64)
[2] count=4713
[0] count=4288
[3] count=4776
[1] count=4521
Subtest stress-bo: SUCCESS (21,492s)

Patch applied, NORMAL + RR give roughly the same results:
# nfs/intel-gpu-tools/tests/kms_cursor_legacy
IGT-Version: 1.14-g9579e5447aa3 (x86_64) (Linux: 4.6.0-patser+ x86_64)
[0] count=77631
[1] count=77740
[3] count=77612
[2] count=77666
Subtest stress-bo: SUCCESS (21,487s)

>>> Suggested-by: Maarten Lankhorst
>>> Signed-off-by: Chris Wilson
>>> Cc: Peter Zijlstra
>>> Cc: Ingo Molnar
>>> Cc: Christian König
>>> Cc: Maarten Lankhorst
>>> Cc: linux-kernel@vger.kernel.org
>>> ---
>>>
>>> Maarten suggested this as a simpler fix to the immediate problem. Imo,
>>> we still want to perform deadlock detection within the spin in order to
>>> catch more complicated deadlocks without osq_lock() forcing fairness!
>>
>> Reviewed-by: Maarten Lankhorst
>>
>> Should this be Cc: sta...@vger.kernel.org ?
>
> Can do; how far back?
Re: [PATCH] mutex: Report recursive ww_mutex locking early
On Mon, May 30, 2016 at 09:43:53AM +0200, Maarten Lankhorst wrote:
> On 26-05-16 at 22:08, Chris Wilson wrote:
> > Recursive locking for ww_mutexes was originally conceived as an
> > exception. However, it is heavily used by the DRM atomic modesetting
> > code. Currently, the recursive deadlock is checked after we have queued
> > up for a busy-spin and as we never release the lock, we spin until
> > kicked, whereupon the deadlock is discovered and reported.
> >
> > A simple solution for the now common problem is to move the recursive
> > deadlock discovery to the first action when taking the ww_mutex.
> >
> > Testcase: igt/kms_cursor_legacy

I've no idea what this tag is or where to find the actual testcase, so
I've killed it.

> > Suggested-by: Maarten Lankhorst
> > Signed-off-by: Chris Wilson
> > Cc: Peter Zijlstra
> > Cc: Ingo Molnar
> > Cc: Christian König
> > Cc: Maarten Lankhorst
> > Cc: linux-kernel@vger.kernel.org
> > ---
> >
> > Maarten suggested this as a simpler fix to the immediate problem. Imo,
> > we still want to perform deadlock detection within the spin in order to
> > catch more complicated deadlocks without osq_lock() forcing fairness!
>
> Reviewed-by: Maarten Lankhorst
>
> Should this be Cc: sta...@vger.kernel.org ?

Can do; how far back?
Re: [PATCH] mutex: Report recursive ww_mutex locking early
On 26-05-16 at 22:08, Chris Wilson wrote:
> Recursive locking for ww_mutexes was originally conceived as an
> exception. However, it is heavily used by the DRM atomic modesetting
> code. Currently, the recursive deadlock is checked after we have queued
> up for a busy-spin and as we never release the lock, we spin until
> kicked, whereupon the deadlock is discovered and reported.
>
> A simple solution for the now common problem is to move the recursive
> deadlock discovery to the first action when taking the ww_mutex.
>
> Testcase: igt/kms_cursor_legacy
> Suggested-by: Maarten Lankhorst
> Signed-off-by: Chris Wilson
> Cc: Peter Zijlstra
> Cc: Ingo Molnar
> Cc: Christian König
> Cc: Maarten Lankhorst
> Cc: linux-kernel@vger.kernel.org
> ---
>
> Maarten suggested this as a simpler fix to the immediate problem. Imo,
> we still want to perform deadlock detection within the spin in order to
> catch more complicated deadlocks without osq_lock() forcing fairness!

Reviewed-by: Maarten Lankhorst

Should this be Cc: sta...@vger.kernel.org ?

I think in the normal case things would move forward even with osq_lock,
but you could make a separate patch to add the check to
mutex_can_spin_on_owner(), with the same comment as in
mutex_optimistic_spin().
> ---
>  kernel/locking/mutex.c | 9 ++++++---
>  1 file changed, 6 insertions(+), 3 deletions(-)
>
> diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
> index d60f1ba3e64f..1659398dc8f8 100644
> --- a/kernel/locking/mutex.c
> +++ b/kernel/locking/mutex.c
> @@ -502,9 +502,6 @@ __ww_mutex_lock_check_stamp(struct mutex *lock, struct ww_acquire_ctx *ctx)
>  	if (!hold_ctx)
>  		return 0;
>
> -	if (unlikely(ctx == hold_ctx))
> -		return -EALREADY;
> -
>  	if (ctx->stamp - hold_ctx->stamp <= LONG_MAX &&
>  	    (ctx->stamp != hold_ctx->stamp || ctx > hold_ctx)) {
>  #ifdef CONFIG_DEBUG_MUTEXES
> @@ -530,6 +527,12 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
>  	unsigned long flags;
>  	int ret;
>
> +	if (use_ww_ctx) {
> +		struct ww_mutex *ww = container_of(lock, struct ww_mutex, base);
> +		if (unlikely(ww_ctx == READ_ONCE(ww->ctx)))
> +			return -EALREADY;
> +	}
> +
>  	preempt_disable();
>  	mutex_acquire_nest(&lock->dep_map, subclass, 0, nest_lock, ip);