Re: [RFC][PATCH v2 5/5] mutex: Give spinners a chance to spin_on_owner if need_resched() triggered while queued

2014-02-03 Thread Jason Low
On Mon, 2014-02-03 at 20:25 +0100, Peter Zijlstra wrote: > +void m_spin_unlock(struct m_spinlock **lock) > +{ > + struct m_spinlock *node = this_cpu_ptr(&m_node); > + struct m_spinlock *next; > + > + if (likely(cmpxchg(lock, node, NULL) == node)) > + return; At this curren
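
A minimal userspace sketch of the queued-unlock shape quoted above, using C11 atomics in place of the kernel's cmpxchg and per-CPU node lookup (the node is passed in explicitly here); this shows the fast-path/slow-path split, not Peter's actual patch:

```c
#include <stdatomic.h>
#include <stddef.h>

struct m_spinlock {
    _Atomic(struct m_spinlock *) next;
    atomic_int locked;
};

void m_spin_unlock(_Atomic(struct m_spinlock *) *lock, struct m_spinlock *node)
{
    struct m_spinlock *expected = node;

    /* Fast path: we are still the queue tail, so clearing the lock word
     * releases the lock with a single compare-and-swap. */
    if (atomic_compare_exchange_strong(lock, &expected, NULL))
        return;

    /* Slow path: a successor queued behind us; wait for it to publish
     * its node, then hand the lock over by flipping its locked word. */
    struct m_spinlock *next;
    while ((next = atomic_load(&node->next)) == NULL)
        ;
    atomic_store(&next->locked, 1);
}
```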

Re: [RFC][PATCH v2 5/5] mutex: Give spinners a chance to spin_on_owner if need_resched() triggered while queued

2014-02-06 Thread Jason Low
On Wed, 2014-02-05 at 16:44 -0500, Waiman Long wrote: > On 01/29/2014 06:51 AM, Peter Zijlstra wrote: > > On Tue, Jan 28, 2014 at 02:51:35PM -0800, Jason Low wrote: > >>> But urgh, nasty problem. Lemme ponder this a bit. > > OK, please have a very careful look at th

[PATCH v2] sched: Reduce contention in update_cfs_rq_blocked_load

2014-08-26 Thread Jason Low
found that it was able to reduce the overhead of the function by up to a factor of 20x. Cc: Yuyang Du Cc: Waiman Long Cc: Mel Gorman Cc: Mike Galbraith Cc: Rik van Riel Cc: Aswin Chandramouleeswaran Cc: Chegu Vinod Cc: Scott J Norton Signed-off-by: Jason Low --- kernel/sched/fair.c | 10

Re: [PATCH v2] sched: Reduce contention in update_cfs_rq_blocked_load

2014-08-27 Thread Jason Low
On Tue, 2014-08-26 at 16:24 -0700, Paul Turner wrote: > On Tue, Aug 26, 2014 at 4:11 PM, Jason Low wrote: > > Based on perf profiles, the update_cfs_rq_blocked_load function constantly > > shows up as taking up a noticeable % of system run time. This is especially > > ap

Re: [PATCH v2] sched: Reduce contention in update_cfs_rq_blocked_load

2014-08-28 Thread Jason Low
On Wed, 2014-08-27 at 16:32 -0700, Tim Chen wrote: > On Wed, 2014-08-27 at 10:34 -0700, Jason Low wrote: > > On Tue, 2014-08-26 at 16:24 -0700, Paul Turner wrote: > > > On Tue, Aug 26, 2014 at 4:11 PM, Jason Low wrote: > > > > Based on perf profiles, the updat

Re: [PATCH v2] sched: Reduce contention in update_cfs_rq_blocked_load

2014-09-02 Thread Jason Low
reduces unnecessary cacheline contention. Cc: Yuyang Du Cc: Aswin Chandramouleeswaran Cc: Chegu Vinod Cc: Scott J Norton Reviewed-by: Ben Segall Reviewed-by: Waiman Long Signed-off-by: Jason Low --- kernel/sched/fair.c |3 +++ 1 files changed, 3 insertions(+), 0 delet

Re: [PATCH] sched: Reduce contention in update_cfs_rq_blocked_load

2014-08-05 Thread Jason Low
On Tue, 2014-08-05 at 03:15 +0800, Yuyang Du wrote: > Hi Jason, > > I am not sure whether you noticed my latest work: rewriting per entity load > average > > http://article.gmane.org/gmane.linux.kernel/1760754 > http://article.gmane.org/gmane.linux.kernel/1760755 > http://article.gmane.org/gmane

Re: [PATCH] sched: Reduce contention in update_cfs_rq_blocked_load

2014-08-06 Thread Jason Low
On Tue, 2014-08-05 at 03:15 +0800, Yuyang Du wrote: > Hi Jason, > > I am not sure whether you noticed my latest work: rewriting per entity load > average > > http://article.gmane.org/gmane.linux.kernel/1760754 > http://article.gmane.org/gmane.linux.kernel/1760755 > http://article.gmane.org/gmane

Re: [PATCH] sched: Reduce contention in update_cfs_rq_blocked_load

2014-08-07 Thread Jason Low
On Fri, 2014-08-08 at 02:02 +0800, Yuyang Du wrote: > On Wed, Aug 06, 2014 at 11:21:35AM -0700, Jason Low wrote: > > I ran these tests with most of the AIM7 workloads to compare its > > performance between a 3.16 kernel and the kernel with these patches > > applied. >

Re: [PATCH v2 1/7] locking/rwsem: check for active writer/spinner before wakeup

2014-08-08 Thread Jason Low
> __visible __used noinline > @@ -730,6 +744,23 @@ __mutex_unlock_common_slowpath(struct mutex *lock, int > nested) > if (__mutex_slowpath_needs_to_unlock()) > atomic_set(&lock->count, 1); > > +/* > + * Skipping the mutex_has_owner() check when DEBUG, allows us to > + * avo

Re: [PATCH v2 1/7] locking/rwsem: check for active writer/spinner before wakeup

2014-08-08 Thread Jason Low
On Fri, 2014-08-08 at 13:21 -0700, Davidlohr Bueso wrote: > On Fri, 2014-08-08 at 12:50 -0700, Jason Low wrote: > > > __visible __used noinline > > > @@ -730,6 +744,23 @@ __mutex_unlock_common_slowpath(struct mutex *lock, > > > int nested) > > >

Re: [PATCH] sched: Reduce contention in update_cfs_rq_blocked_load

2014-08-11 Thread Jason Low
On Mon, 2014-08-04 at 13:52 -0700, bseg...@google.com wrote: > > That said, it might be better to remove force_update for this function, > or make it just reduce the minimum to /64 or something. If the test is > easy to run it would be good to see what it's like just removing the > force_update pa

Re: [PATCH 2/7] locking/rwsem: more aggressive use of optimistic spinning

2014-08-03 Thread Jason Low
On Sun, 2014-08-03 at 22:36 -0400, Waiman Long wrote: > The rwsem_can_spin_on_owner() function currently allows optimistic > spinning only if the owner field is defined and is running. That is > too conservative as it will cause some tasks to miss the opportunity > of doing spinning in case the own

[PATCH] sched: Reduce contention in update_cfs_rq_blocked_load

2014-08-04 Thread Jason Low
a factor of 3x: 1.18%  reaim  [kernel.kallsyms]  [k] update_cfs_rq_blocked_load Signed-off-by: Jason Low --- kernel/sched/fair.c |3 +++ 1 files changed, 3 insertions(+), 0 deletions(-) diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index bfa3c86..8d4cc72 100644 ---
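
A hedged sketch of the early-exit this patch adds: when there is nothing to decay and no forced update was requested, return before touching the contended blocked-load cacheline. Type, field, and constant names are illustrative:

```c
#include <stdint.h>

#define DECAY_PERIOD (1ULL << 20)   /* ~1ms in ns; illustrative constant */

struct cfs_rq_sketch {
    uint64_t last_decay;
    /* ... blocked-load fields shared across CPUs ... */
};

static void update_blocked_load(struct cfs_rq_sketch *cfs_rq,
                                uint64_t now, int force_update)
{
    uint64_t decays = (now - cfs_rq->last_decay) / DECAY_PERIOD;

    /* The whole optimization: skip the shared-cacheline update when
     * there is nothing to do. */
    if (!decays && !force_update)
        return;

    /* ... perform the decay and atomic subtraction as before ... */
    cfs_rq->last_decay = now;
}
```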

Re: [PATCH 1/7] locking/rwsem: don't resched at the end of optimistic spinning

2014-08-04 Thread Jason Low
On Mon, 2014-08-04 at 22:48 +0200, Peter Zijlstra wrote: > On Mon, Aug 04, 2014 at 02:36:35PM -0400, Waiman Long wrote: > > On 08/04/2014 03:55 AM, Peter Zijlstra wrote: > > >On Sun, Aug 03, 2014 at 10:36:16PM -0400, Waiman Long wrote: > > >>For a fully preemptive kernel, a call to preempt_enable()

Re: [PATCH 3/7] locking/rwsem: check for active writer/spinner before wakeup

2014-08-04 Thread Jason Low
On Sun, 2014-08-03 at 22:36 -0400, Waiman Long wrote: > On a highly contended rwsem, spinlock contention due to the slow > rwsem_wake() call can be a significant portion of the total CPU cycles > used. With writer lock stealing and writer optimistic spinning, there > is also a pretty good chance th

Re: [PATCH] sched: Reduce contention in update_cfs_rq_blocked_load

2014-08-04 Thread Jason Low
On Mon, 2014-08-04 at 13:52 -0700, bseg...@google.com wrote: > Jason Low writes: > > > When running workloads on 2+ socket systems, based on perf profiles, the > > update_cfs_rq_blocked_load function constantly shows up as taking up a > > noticeable % of run time. This i

[RFC PATCH] timer: Improve itimers scalability

2015-08-04 Thread Jason Low
more than 30%. This patch addresses this by having the thread_group_cputimer structure maintain a boolean to signify when a thread in the group is already checking for process wide timers, and adds extra logic in the fastpath to check the boolean. Signed-off-by: Jason Low --- include/linux/init_task.h
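
A minimal sketch of the fastpath gate this RFC describes, assuming C11 atomics; the two field names follow the changelog, everything else is illustrative:

```c
#include <stdatomic.h>
#include <stdbool.h>

struct group_cputimer {
    atomic_bool running;          /* process-wide timers are armed */
    atomic_bool checking_timer;   /* a thread is already checking them */
};

static bool should_check_process_timers(struct group_cputimer *ct)
{
    if (!atomic_load(&ct->running))
        return false;   /* nothing armed */

    /* If another thread in the group is already doing the process-wide
     * check, skip it and avoid piling onto the sighand lock. */
    if (atomic_load(&ct->checking_timer))
        return false;

    return true;
}
```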

Re: [RFC PATCH] timer: Improve itimers scalability

2015-08-05 Thread Jason Low
On Wed, 2015-08-05 at 11:37 +0200, Peter Zijlstra wrote: > On Tue, Aug 04, 2015 at 05:29:44PM -0700, Jason Low wrote: > > > @@ -1137,6 +1148,13 @@ static inline int fastpath_timer_check(struct > > task_struct *tsk) > > if (READ_ONCE(sig->cputimer.runni

[PATCH 2/3] timer: Check thread timers only when there are active thread timers

2015-08-25 Thread Jason Low
there are no per-thread timers. Signed-off-by: Jason Low --- kernel/time/posix-cpu-timers.c | 10 ++ 1 files changed, 6 insertions(+), 4 deletions(-) diff --git a/kernel/time/posix-cpu-timers.c b/kernel/time/posix-cpu-timers.c index 02596ff..535bef5 100644 --- a/kernel/time/posix-cpu

[PATCH 1/3] timer: Optimize fastpath_timer_check()

2015-08-25 Thread Jason Low
timers set. Signed-off-by: Jason Low --- kernel/time/posix-cpu-timers.c | 15 +++ 1 files changed, 7 insertions(+), 8 deletions(-) diff --git a/kernel/time/posix-cpu-timers.c b/kernel/time/posix-cpu-timers.c index 892e3da..02596ff 100644 --- a/kernel/time/posix-cpu-timers.c +++ b

[PATCH 0/3] timer: Improve itimers scalability

2015-08-25 Thread Jason Low
time is spent trying to acquire the sighand lock. It was found in some cases that 200+ threads were simultaneously contending for the same sighand lock, reducing throughput by more than 30%. Jason Low (3): timer: Optimize fastpath_timer_check() timer: Check thread timers only when there are

[PATCH 3/3] timer: Reduce unnecessary sighand lock contention

2015-08-25 Thread Jason Low
maintain a boolean to signify when a thread in the group is already checking for process wide timers, and adds extra logic in the fastpath to check the boolean. Signed-off-by: Jason Low --- include/linux/init_task.h |1 + include/linux/sched.h |3 +++ kernel/time/

Re: [PATCH 0/3] timer: Improve itimers scalability

2015-08-26 Thread Jason Low
Hi Andrew, On Tue, 2015-08-25 at 20:27 -0700, Andrew Morton wrote: > On Tue, 25 Aug 2015 20:17:45 -0700 Jason Low wrote: > > > When running a database workload on a 16 socket machine, there were > > scalability issues related to itimers. > > > > Commit 1018016c706

Re: [PATCH 1/3] timer: Optimize fastpath_timer_check()

2015-08-26 Thread Jason Low
On Wed, 2015-08-26 at 12:57 -0400, George Spelvin wrote: > > if (!task_cputime_zero(&tsk->cputime_expires)) { > >+struct task_cputime task_sample; > >+cputime_t utime, stime; > >+ > >+task_cputime(tsk, &utime, &stime); > >+task_sample.utime = utim
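
The shape of the quoted hunk, reduced to a self-contained sketch (types and the sampling callback are stand-ins): the clocks are sampled only once some expiration time is known to be armed:

```c
#include <stdbool.h>

struct cputime_sketch { unsigned long long utime, stime, sum_exec_runtime; };

static bool fastpath_check(const struct cputime_sketch *expires,
                           void (*sample)(struct cputime_sketch *out))
{
    /* No timers armed: skip the relatively expensive sampling. */
    if (!expires->utime && !expires->stime && !expires->sum_exec_runtime)
        return false;

    struct cputime_sketch now;
    sample(&now);   /* only paid for when a timer is actually set */

    return (expires->utime && now.utime >= expires->utime) ||
           (expires->stime && now.stime >= expires->stime) ||
           (expires->sum_exec_runtime &&
            now.sum_exec_runtime >= expires->sum_exec_runtime);
}
```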

Re: [PATCH 2/3] timer: Check thread timers only when there are active thread timers

2015-08-26 Thread Jason Low
On Wed, 2015-08-26 at 13:04 -0400, George Spelvin wrote: > - check_thread_timers(tsk, &firing); > + if (!task_cputime_zero(&tsk->cputime_expires)) > + check_thread_timers(tsk, &firing); > > Sincere question; I'm not certain myself: would it make more sense to put > this shortcu

Re: [PATCH 0/3] timer: Improve itimers scalability

2015-08-26 Thread Jason Low
On Wed, 2015-08-26 at 19:08 +0200, Oleg Nesterov wrote: > On 08/26, Jason Low wrote: > > > > Hi Andrew, > > > > On Tue, 2015-08-25 at 20:27 -0700, Andrew Morton wrote: > > > On Tue, 25 Aug 2015 20:17:45 -0700 Jason Low wrote: > > > > > > >

Re: [PATCH 3/3] timer: Reduce unnecessary sighand lock contention

2015-08-26 Thread Jason Low
On Thu, 2015-08-27 at 00:31 +0200, Frederic Weisbecker wrote: > On Wed, Aug 26, 2015 at 10:53:35AM -0700, Linus Torvalds wrote: > > On Tue, Aug 25, 2015 at 8:17 PM, Jason Low wrote: > > > > > > This patch addresses this by having the thread_group_cputimer structure

Re: [PATCH 3/3] timer: Reduce unnecessary sighand lock contention

2015-08-26 Thread Jason Low
On Thu, 2015-08-27 at 00:56 +0200, Frederic Weisbecker wrote: > On Tue, Aug 25, 2015 at 08:17:48PM -0700, Jason Low wrote: > > It was found while running a database workload on large systems that > > significant time was spent trying to acquire the sighand lock.

Re: [PATCH 3/3] timer: Reduce unnecessary sighand lock contention

2015-08-26 Thread Jason Low
On Wed, 2015-08-26 at 15:33 -0400, George Spelvin wrote: > And some more comments on the series... > > > @@ -626,6 +628,7 @@ struct task_cputime_atomic { > > struct thread_group_cputimer { > > struct task_cputime_atomic cputime_atomic; > > int running; > >+int checking_timer; > > }; >

Re: [PATCH 3/3] timer: Reduce unnecessary sighand lock contention

2015-08-26 Thread Jason Low
On Wed, 2015-08-26 at 16:32 -0700, Jason Low wrote: > Perhaps to be safer, we use something like load_acquire() and > store_release() for accessing both the ->running and ->checking_timer > fields? Regarding using barriers, one option could be to pair them between sig->cputi

Re: [PATCH v2] sched: fix nohz.next_balance update

2015-08-27 Thread Jason Low
> > > > nohz_idle_balance must set the nohz.next_balance without taking into > > account this_rq->next_balance which is not updated yet. Then, this_rq will > > update nohz.next_update with its next_balance once updated and if necessary. > > > > Signed-off-by:

Re: [PATCH 3/3] timer: Reduce unnecessary sighand lock contention

2015-08-27 Thread Jason Low
On Thu, 2015-08-27 at 14:53 +0200, Frederic Weisbecker wrote: > On Wed, Aug 26, 2015 at 04:32:34PM -0700, Jason Low wrote: > > On Thu, 2015-08-27 at 00:56 +0200, Frederic Weisbecker wrote: > > > On Tue, Aug 25, 2015 at 08:17:48PM -0700, Jason Low wrote: > > > >

Re: [PATCH 3/3] timer: Reduce unnecessary sighand lock contention

2015-08-27 Thread Jason Low
On Wed, 2015-08-26 at 21:28 -0400, George Spelvin wrote: > > I can include your patch in the series and then use boolean for the new > > checking_timer field. However, it looks like this applies on an old > > kernel. For example, the spin_lock field has already been removed from > > the structure.

Re: [RFC PATCH] timer: Improve itimers scalability

2015-08-06 Thread Jason Low
On Thu, 2015-08-06 at 16:18 +0200, Oleg Nesterov wrote: > On 08/04, Jason Low wrote: > > > > @@ -973,13 +981,6 @@ static void check_process_timers(struct task_struct > > *tsk, > > virt_expires = check_timers_list(++timers, firing, utime); > > sched_exp

Re: [PATCH v2] MCS spinlock: Use smp_cond_load_acquire()

2016-04-13 Thread Jason Low
On Wed, 2016-04-13 at 10:43 -0700, Will Deacon wrote: > On Tue, Apr 12, 2016 at 08:02:17PM -0700, Jason Low wrote: > > For qspinlocks on ARM64, we would like to use WFE instead > > of purely spinning. Qspinlocks internally have lock > > contenders spin on an MCS

[RFC] arm64: Implement WFE based spin wait for MCS spinlocks

2016-04-14 Thread Jason Low
Use WFE to avoid most spinning with MCS spinlocks. This is implemented with the new cmpwait() mechanism for comparing and waiting for the MCS locked value to change using LDXR + WFE. Signed-off-by: Jason Low --- arch/arm64/include/asm/mcs_spinlock.h | 21 + 1 file changed
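
A userspace model of the idea, with a busy loop standing in for the LDXR+WFE pair that cmpwait() provides on arm64; only the loop structure mirrors the RFC:

```c
#include <stdatomic.h>

/* Models arm64 cmpwait(): block until *addr no longer equals val.
 * On arm64 the loop body would be WFE, not a busy spin. */
static inline void cmpwait(atomic_int *addr, int val)
{
    while (atomic_load_explicit(addr, memory_order_relaxed) == val)
        ;
}

static void mcs_spin_lock_contended(atomic_int *locked)
{
    int val;

    for (;;) {
        val = atomic_load_explicit(locked, memory_order_acquire);
        if (val)
            break;              /* previous owner handed us the lock */
        cmpwait(locked, val);   /* sleep until the word changes */
    }
}
```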

Re: [PATCH v2] locking/rwsem: Add reader-owned state to the owner field

2016-05-09 Thread Jason Low
> >19.95% 5.88% fio [kernel.vmlinux] [k] rwsem_down_write_failed >14.20% 1.52% fio [kernel.vmlinux] [k] rwsem_down_write_failed > > The actual CPU cycles spent in rwsem_down_write_failed() dropped from > 5.88% to 1.52% after the patch. > > The xfstests were also run and no regression was observed. > > Signed-off-by: Waiman Long Acked-by: Jason Low

[PATCH] locking/rwsem: Optimize write lock slowpath

2016-05-09 Thread Jason Low
operations. We can instead make the list_is_singular() check first, and then set the count accordingly, so that we issue at most 1 atomic operation when acquiring the write lock and reduce unnecessary cacheline contention. Signed-off-by: Jason Low --- kernel/locking/rwsem-xadd.c | 20
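
A sketch of the reordering described here, under assumed bias values (the kernel's RWSEM_*_BIAS constants differ): the target count is chosen from list_is_singular() up front, so the lock is acquired with a single CAS instead of an atomic update plus a fixup:

```c
#include <stdatomic.h>
#include <stdbool.h>

#define WAITING_BIAS       (-1L)   /* illustrative, not the kernel values */
#define ACTIVE_WRITE_BIAS  ( 1L)

static bool try_write_lock(atomic_long *count, bool only_waiter)
{
    long old = WAITING_BIAS;
    long new = only_waiter ? ACTIVE_WRITE_BIAS
                           : ACTIVE_WRITE_BIAS + WAITING_BIAS;

    /* At most one atomic operation on the contended count cacheline. */
    return atomic_compare_exchange_strong(count, &old, new);
}
```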

Re: [PATCH] locking/rwsem: Add reader owned state to the owner field

2016-05-04 Thread Jason Low
On Wed, 2016-05-04 at 13:27 -0400, Waiman Long wrote: > On 05/03/2016 08:21 PM, Davidlohr Bueso wrote: > > On Wed, 27 Apr 2016, Waiman Long wrote: > >> static bool rwsem_optimistic_spin(struct rw_semaphore *sem) > >> @@ -378,7 +367,8 @@ static bool rwsem_optimistic_spin(struct > >> rw_semaphore *s

Re: [PATCH] locking/rwsem: Optimize write lock slowpath

2016-05-11 Thread Jason Low
On Wed, 2016-05-11 at 13:49 +0200, Peter Zijlstra wrote: > On Mon, May 09, 2016 at 12:16:37PM -0700, Jason Low wrote: > > When acquiring the rwsem write lock in the slowpath, we first try > > to set count to RWSEM_WAITING_BIAS. When that is successful, > > we th

Re: [PATCH] locking/rwsem: Optimize write lock slowpath

2016-05-11 Thread Jason Low
On Wed, 2016-05-11 at 11:33 -0700, Davidlohr Bueso wrote: > On Wed, 11 May 2016, Peter Zijlstra wrote: > > >On Mon, May 09, 2016 at 12:16:37PM -0700, Jason Low wrote: > >> When acquiring the rwsem write lock in the slowpath, we first try > >> to set count to RW

Re: [PATCH] locking/mutex: Set and clear owner using WRITE_ONCE()

2016-05-20 Thread Jason Low
On Fri, 2016-05-20 at 16:27 -0400, Waiman Long wrote: > On 05/19/2016 06:23 PM, Jason Low wrote: > > The mutex owner can get read and written to without the wait_lock. > > Use WRITE_ONCE when setting and clearing the owner field in order > > to avoid optimizations such a

[PATCH v2] locking/mutex: Set and clear owner using WRITE_ONCE()

2016-05-20 Thread Jason Low
use a partially written owner value. Signed-off-by: Jason Low Acked-by: Davidlohr Bueso --- kernel/locking/mutex-debug.h | 4 ++-- kernel/locking/mutex.h | 10 -- 2 files changed, 10 insertions(+), 4 deletions(-) diff --git a/kernel/locking/mutex-debug.h b/kernel/locking/mutex
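
The change in C11 terms, with illustrative names: a relaxed atomic store models WRITE_ONCE(), preventing the compiler from tearing the owner store that spinners read without the wait_lock:

```c
#include <stdatomic.h>
#include <stddef.h>

struct task;

struct mutex_sketch {
    _Atomic(struct task *) owner;   /* read locklessly by spinners */
};

static inline void mutex_set_owner(struct mutex_sketch *lock, struct task *t)
{
    /* relaxed atomic store: the C11 analogue of WRITE_ONCE() */
    atomic_store_explicit(&lock->owner, t, memory_order_relaxed);
}

static inline void mutex_clear_owner(struct mutex_sketch *lock)
{
    atomic_store_explicit(&lock->owner, NULL, memory_order_relaxed);
}
```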

Re: [PATCH v4 2/5] locking/rwsem: Protect all writes to owner by WRITE_ONCE

2016-05-23 Thread Jason Low
On Sat, 2016-05-21 at 09:04 -0700, Peter Hurley wrote: > On 05/18/2016 12:58 PM, Jason Low wrote: > > It should be fine to use the standard READ_ONCE here, even if it's just > > for documentation, as it's probably not going to cost anything in > > practice. It woul

Re: [PATCH] locking/mutex: Set and clear owner using WRITE_ONCE()

2016-05-23 Thread Jason Low
On Fri, 2016-05-20 at 18:00 -0700, Davidlohr Bueso wrote: > On Fri, 20 May 2016, Waiman Long wrote: > > >I think mutex-debug.h also needs similar changes for completeness. > > Maybe, but given that with debug the wait_lock is unavoidable, doesn't > this send the wrong message? The mutex_set_owne

Re: [PATCH] locking/mutex: Set and clear owner using WRITE_ONCE()

2016-05-23 Thread Jason Low
On Mon, 2016-05-23 at 14:31 -0700, Davidlohr Bueso wrote: > On Mon, 23 May 2016, Jason Low wrote: > > >On Fri, 2016-05-20 at 18:00 -0700, Davidlohr Bueso wrote: > >> On Fri, 20 May 2016, Waiman Long wrote: > >> > >> >I think mutex-debug.h a

[PATCH v3] locking/mutex: Set and clear owner using WRITE_ONCE()

2016-05-24 Thread Jason Low
use a partially written owner value. This is not necessary in the debug case where the owner gets modified with the wait_lock held. Signed-off-by: Jason Low Acked-by: Davidlohr Bueso Acked-by: Waiman Long --- kernel/locking/mutex-debug.h | 5 + kernel/locking/mutex.h | 10

Re: [RFC][PATCH 0/7] locking/rwsem: Convert rwsem count to atomic_long_t

2016-05-17 Thread Jason Low
On Tue, 2016-05-17 at 13:09 +0200, Peter Zijlstra wrote: > On Mon, May 16, 2016 at 06:12:25PM -0700, Linus Torvalds wrote: > > On Mon, May 16, 2016 at 5:37 PM, Jason Low wrote: > > > > > > This rest of the series converts the rwsem count variable to an > > >

Re: [PATCH v4 2/5] locking/rwsem: Protect all writes to owner by WRITE_ONCE

2016-05-18 Thread Jason Low
On Wed, 2016-05-18 at 07:04 -0700, Davidlohr Bueso wrote: > On Tue, 17 May 2016, Waiman Long wrote: > > >Without using WRITE_ONCE(), the compiler can potentially break a > >write into multiple smaller ones (store tearing). So a read from the > >same data by another task concurrently may return a p

Re: [PATCH v4 2/5] locking/rwsem: Protect all writes to owner by WRITE_ONCE()

2016-05-18 Thread Jason Low
READ_ONCE() may > >not be needed for rwsem->owner as long as the value is only used for > >comparison and not dereferencing. > > > >Signed-off-by: Waiman Long > > Yes, ->owner can obviously be handled locklessly during optimistic > spinning. > > Acked-by: Davidlohr Bueso Acked-by: Jason Low

Re: [PATCH v4 2/5] locking/rwsem: Protect all writes to owner by WRITE_ONCE

2016-05-18 Thread Jason Low
On Wed, 2016-05-18 at 14:29 -0400, Waiman Long wrote: > On 05/18/2016 01:21 PM, Jason Low wrote: > > On Wed, 2016-05-18 at 07:04 -0700, Davidlohr Bueso wrote: > >> On Tue, 17 May 2016, Waiman Long wrote: > >> > >>> Without using WRITE_ONCE(), the compiler c

Re: [PATCH v4 2/5] locking/rwsem: Protect all writes to owner by WRITE_ONCE

2016-05-19 Thread Jason Low
On Wed, 2016-05-18 at 12:58 -0700, Jason Low wrote: > On Wed, 2016-05-18 at 14:29 -0400, Waiman Long wrote: > > On 05/18/2016 01:21 PM, Jason Low wrote: > > > On Wed, 2016-05-18 at 07:04 -0700, Davidlohr Bueso wrote: > > >> On Tue, 17 May 2016, Waiman Long wrote

[PATCH] locking/mutex: Set and clear owner using WRITE_ONCE()

2016-05-19 Thread Jason Low
read and use a partially written owner value. Signed-off-by: Jason Low --- kernel/locking/mutex.h | 10 -- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/kernel/locking/mutex.h b/kernel/locking/mutex.h index 5cda397..469b61e 100644 --- a/kernel/locking/mutex.h +++ b/kernel

[PATCH v2] locking/rwsem: Optimize write lock by reducing operations in slowpath

2016-05-16 Thread Jason Low
operations. We can instead make the list_is_singular() check first, and then set the count accordingly, so that we issue at most 1 atomic operation when acquiring the write lock and reduce unnecessary cacheline contention. Signed-off-by: Jason Low Acked-by: Waiman Long Acked-by: Davidlohr Bueso

[RFC][PATCH 0/7] locking/rwsem: Convert rwsem count to atomic_long_t

2016-05-16 Thread Jason Low
The first patch contains an optimization for acquiring the rwsem write lock in the slowpath. The rest of the series converts the rwsem count variable to an atomic_long_t, since it is used as an atomic variable. This allows us to also remove the rwsem_atomic_{add,update} abstraction and reduce 1

[RFC][PATCH 2/7] locking/rwsem: Convert sem->count to atomic_long_t

2016-05-16 Thread Jason Low
add,update} definitions across the various architectures. Suggested-by: Peter Zijlstra Signed-off-by: Jason Low --- include/linux/rwsem.h | 6 +++--- kernel/locking/rwsem-xadd.c | 31 --- 2 files changed, 19 insertions(+), 18 deletions(-) diff --git a/include/linux
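
An illustration (not the kernel hunk) of what the conversion buys: once the count is an atomic long, generic atomic operations replace the per-arch rwsem_atomic_add()/rwsem_atomic_update() wrappers removed later in the series:

```c
#include <stdatomic.h>

struct rwsem_sketch {
    atomic_long count;   /* was: plain long plus per-arch helpers */
};

/* The atomic_long_add_return() pattern in portable C11 terms. */
static long rwsem_count_update(struct rwsem_sketch *sem, long adjustment)
{
    return atomic_fetch_add(&sem->count, adjustment) + adjustment;
}
```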

[RFC][PATCH 1/7] locking/rwsem: Optimize write lock by reducing operations in slowpath

2016-05-16 Thread Jason Low
operations. We can instead make the list_is_singular() check first, and then set the count accordingly, so that we issue at most 1 atomic operation when acquiring the write lock and reduce unnecessary cacheline contention. Signed-off-by: Jason Low Acked-by: Waiman Long Acked-by: Davidlohr Bueso

[RFC][PATCH 3/7] locking,x86: Remove x86 rwsem add and rwsem update

2016-05-16 Thread Jason Low
The rwsem count has been converted to an atomic variable and the rwsem code now directly uses atomic_long_add() and atomic_long_add_return(), so we can remove the x86 implementation of rwsem_atomic_add() and rwsem_atomic_update(). Signed-off-by: Jason Low --- arch/x86/include/asm/rwsem.h | 18

[RFC][PATCH 4/7] locking,alpha: Remove Alpha rwsem add and rwsem update

2016-05-16 Thread Jason Low
The rwsem count has been converted to an atomic variable and the rwsem code now directly uses atomic_long_add() and atomic_long_add_return(), so we can remove the alpha implementation of rwsem_atomic_add() and rwsem_atomic_update(). Signed-off-by: Jason Low --- arch/alpha/include/asm/rwsem.h

[RFC][PATCH 5/7] locking,ia64: Remove ia64 rwsem add and rwsem update

2016-05-16 Thread Jason Low
The rwsem count has been converted to an atomic variable and the rwsem code now directly uses atomic_long_add() and atomic_long_add_return(), so we can remove the ia64 implementation of rwsem_atomic_add() and rwsem_atomic_update(). Signed-off-by: Jason Low --- arch/ia64/include/asm/rwsem.h | 7

[RFC][PATCH 6/7] locking,s390: Remove s390 rwsem add and rwsem update

2016-05-16 Thread Jason Low
The rwsem count has been converted to an atomic variable and the rwsem code now directly uses atomic_long_add() and atomic_long_add_return(), so we can remove the s390 implementation of rwsem_atomic_add() and rwsem_atomic_update(). Signed-off-by: Jason Low --- arch/s390/include/asm/rwsem.h | 37

[RFC][PATCH 7/7] locking,asm-generic: Remove generic rwsem add and rwsem update definitions

2016-05-16 Thread Jason Low
The rwsem count has been converted to an atomic variable and we now directly use atomic_long_add() and atomic_long_add_return() on the count, so we can remove the asm-generic implementation of rwsem_atomic_add() and rwsem_atomic_update(). Signed-off-by: Jason Low --- include/asm-generic/rwsem.h

Re: [PATCH RFC] locking/mutexes: don't spin on owner when wait list is not NULL.

2016-01-22 Thread Jason Low
On Fri, 2016-01-22 at 09:54 +0100, Peter Zijlstra wrote: > On Thu, Jan 21, 2016 at 06:02:34PM -0500, Waiman Long wrote: > > This patch attempts to fix this live-lock condition by enabling the > > a woken task in the wait list to enter optimistic spinning loop itself > > with precedence over the one

Re: [PATCH 0/2] locking/mutex: Enable optimistic spinning of lock waiter

2016-02-09 Thread Jason Low
's patch: > > 1) Ding Tianhong still find that hanging task could happen in some cases. > 2) Jason Low found that there was performance regression for some AIM7 > workloads. This might help address the hang that Ding reported. Performance-wise, this patchset reduced AI

Re: [PATCH v2 1/4] locking/mutex: Add waiter parameter to mutex_optimistic_spin()

2016-02-15 Thread Jason Low
On Fri, 2016-02-12 at 14:14 -0800, Davidlohr Bueso wrote: > On Fri, 12 Feb 2016, Peter Zijlstra wrote: > > >On Fri, Feb 12, 2016 at 12:32:12PM -0500, Waiman Long wrote: > >> static bool mutex_optimistic_spin(struct mutex *lock, > >> +struct ww_acquire_ctx *ww_ctx, > >>

Re: [PATCH v2 1/4] locking/mutex: Add waiter parameter to mutex_optimistic_spin()

2016-02-15 Thread Jason Low
On Mon, 2016-02-15 at 18:15 -0800, Jason Low wrote: > On Fri, 2016-02-12 at 14:14 -0800, Davidlohr Bueso wrote: > > On Fri, 12 Feb 2016, Peter Zijlstra wrote: > > > > >On Fri, Feb 12, 2016 at 12:32:12PM -0500, Waiman Long wrote: > > >> static bool mute

Re: [PATCH v2 1/4] locking/mutex: Add waiter parameter to mutex_optimistic_spin()

2016-02-15 Thread Jason Low
On Mon, 2016-02-15 at 18:55 -0500, Waiman Long wrote: > On 02/12/2016 03:40 PM, Peter Zijlstra wrote: > > On Fri, Feb 12, 2016 at 12:32:12PM -0500, Waiman Long wrote: > >> @@ -358,8 +373,8 @@ static bool mutex_optimistic_spin(struct mutex *lock, > >>} > >> > >>

[PATCH] sched, timer: Fix documentation for 'struct thread_group_cputimer'

2015-05-08 Thread Jason Low
On Fri, 2015-05-08 at 06:22 -0700, tip-bot for Jason Low wrote: > Commit-ID: 1018016c706f7ff9f56fde3a649789c47085a293 > Gitweb: http://git.kernel.org/tip/1018016c706f7ff9f56fde3a649789c47085a293 > Author: Jason Low > AuthorDate: Tue, 28 Apr 2015 13:00:22 -0700 > Committe

Re: [PATCH v2 2/5] sched, numa: Document usages of mm->numa_scan_seq

2015-05-01 Thread Jason Low
On Fri, 2015-05-01 at 08:21 -0700, Paul E. McKenney wrote: > On Thu, Apr 30, 2015 at 02:13:07PM -0700, Jason Low wrote: > > On Thu, 2015-04-30 at 14:42 -0400, Waiman Long wrote: > > > > > I do have a question of what kind of tearing you are talking about. Do > > &g

Re: [PATCH] locking/rwsem: Fix lock optimistic spinning when owner is not running

2015-03-09 Thread Jason Low
On Sat, 2015-03-07 at 10:21 +0100, Peter Zijlstra wrote: > On Fri, Mar 06, 2015 at 11:45:31PM -0800, Jason Low wrote: > > static noinline > > bool rwsem_spin_on_owner(struct rw_semaphore *sem, struct task_struct > > *owner) > > { > > long count; > >

Re: [PATCH] locking/rwsem: Fix lock optimistic spinning when owner is not running

2015-03-09 Thread Jason Low
On Sat, 2015-03-07 at 13:17 -0500, Sasha Levin wrote: > On 03/07/2015 02:45 AM, Jason Low wrote: > > Fixes tip commit b3fd4f03ca0b (locking/rwsem: Avoid deceiving lock > > spinners). > > > > Ming reported soft lockups occurring when running xfstest due to > > c

[PATCH] locking/mutex: Refactor mutex_spin_on_owner()

2015-03-09 Thread Jason Low
loop directly check for while (lock->owner == owner). This improves the readability of the code. Signed-off-by: Jason Low --- kernel/locking/mutex.c | 17 + 1 files changed, 5 insertions(+), 12 deletions(-) diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c index 1
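
A hedged userspace sketch of the refactored loop: spinning continues only while the owner is unchanged and on-CPU, so an owner change (the lock was released) simply falls out of the loop:

```c
#include <stdatomic.h>
#include <stdbool.h>

struct task_sketch { atomic_bool on_cpu; };

static bool spin_on_owner(_Atomic(struct task_sketch *) *owner_field,
                          struct task_sketch *owner)
{
    while (atomic_load(owner_field) == owner) {
        if (!atomic_load(&owner->on_cpu))
            return false;   /* owner was preempted: stop spinning, block */
    }
    return true;            /* owner changed: lock likely free, retry */
}
```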

Re: [PATCH] locking/mutex: Refactor mutex_spin_on_owner()

2015-03-10 Thread Jason Low
On Tue, 2015-03-10 at 09:11 +0100, Ingo Molnar wrote: > * Jason Low wrote: > > > This patch applies on top of tip. > > > > --- > > Similar to what Linus suggested for rwsem_spin_on_owner(), in

Re: [PATCH V2] sched: Improve load balancing in the presence of idle CPUs

2015-03-26 Thread Jason Low
s a chance to load balance. Else we may > + * load balance only within the local sched_domain hierarchy > + * and abort nohz_idle_balance altogether if we pull some load. >*/ > nohz_idle_balance(this_rq, idle); > + rebalance_domains(this_rq, idle); Reviewe

Re: [PATCH V2] sched: Improve load balancing in the presence of idle CPUs

2015-03-26 Thread Jason Low
On Fri, 2015-03-27 at 10:03 +0530, Preeti U Murthy wrote: > Hi Wanpeng > > On 03/27/2015 07:42 AM, Wanpeng Li wrote: > > Hi Preeti, > > On Thu, Mar 26, 2015 at 06:32:44PM +0530, Preeti U Murthy wrote: > >> > >> 1. An ILB CPU was chosen from the first numa domain to trigger nohz idle > >> load bala

Re: [PATCH V2] sched: Improve load balancing in the presence of idle CPUs

2015-03-26 Thread Jason Low
On Fri, 2015-03-27 at 10:12 +0800, Wanpeng Li wrote: > Hi Preeti, > On Thu, Mar 26, 2015 at 06:32:44PM +0530, Preeti U Murthy wrote: > > > >1. An ILB CPU was chosen from the first numa domain to trigger nohz idle > >load balancing [Given the experiment, upto 6 CPUs per core could be > >potentially

Re: [PATCH] sched/fair: fix update the nohz.next_balance even if we haven't done any load balance

2015-03-27 Thread Jason Low
need_resched()) > goto end; How about having: if (idle != CPU_IDLE || need_resched() || !test_bit(NOHZ_BALANCE_KICK, nohz_flags(this_cpu))) which wouldn't require adding a new line. Besides that: Reviewed-by: Jason Low

[PATCH] mm: Remove usages of ACCESS_ONCE

2015-03-23 Thread Jason Low
using separate/multiple sets of APIs. Signed-off-by: Jason Low --- mm/gup.c |4 ++-- mm/huge_memory.c |4 ++-- mm/internal.h|4 ++-- mm/ksm.c | 10 +- mm/memcontrol.c | 18 +- mm/memory.c |2 +- mm/mmap.c|8 ---

Re: [PATCH] mm: Remove usages of ACCESS_ONCE

2015-03-24 Thread Jason Low
On Tue, 2015-03-24 at 15:42 +0100, Christian Borntraeger wrote: > Am 23.03.2015 um 23:44 schrieb Jason Low: > > Commit 38c5ce936a08 converted ACCESS_ONCE usage in gup_pmd_range() to > > READ_ONCE, since ACCESS_ONCE doesn't work reliably on non-scalar types. > > > >

Re: [PATCH] mm: Remove usages of ACCESS_ONCE

2015-03-24 Thread Jason Low
On Tue, 2015-03-24 at 11:30 +0100, Michal Hocko wrote: > On Mon 23-03-15 15:44:40, Jason Low wrote: > > Commit 38c5ce936a08 converted ACCESS_ONCE usage in gup_pmd_range() to > > READ_ONCE, since ACCESS_ONCE doesn't work reliably on non-scalar types. > > > > Th

[PATCH v2 1/2] mm: Use READ_ONCE() for non-scalar types

2015-03-24 Thread Jason Low
es_fast() in mm/gup.c Signed-off-by: Jason Low Acked-by: Michal Hocko Acked-by: Davidlohr Bueso Acked-by: Rik van Riel Reviewed-by: Christian Borntraeger --- mm/gup.c |4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/mm/gup.c b/mm/gup.c index ca7b607..6297f6b 100644 --- a
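
A userspace model of why the change is needed: ACCESS_ONCE() reads through a cast to a volatile scalar, which is unreliable for an aggregate like pmd_t, while READ_ONCE() copies the object bytewise (simplified; not the actual kernel macro):

```c
#include <string.h>

typedef struct { unsigned long val; } pmd_t;   /* aggregate, not a scalar */

static pmd_t read_once_pmd(const volatile pmd_t *p)
{
    pmd_t out;

    /* Bytewise copy of the object: the essence of READ_ONCE() for
     * non-scalar types.  ACCESS_ONCE() could not express this. */
    memcpy(&out, (const void *)p, sizeof(out));
    return out;
}
```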

[PATCH v2 2/2] mm: Remove rest of ACCESS_ONCE() usages

2015-03-24 Thread Jason Low
arate/multiple sets of APIs. Signed-off-by: Jason Low Acked-by: Michal Hocko Acked-by: Davidlohr Bueso Acked-by: Rik van Riel Reviewed-by: Christian Borntraeger --- mm/huge_memory.c |4 ++-- mm/internal.h|4 ++-- mm/ksm.c | 10 +- mm/memcontrol.c |

[PATCH v2 0/2] mm: Remove usages of ACCESS_ONCE()

2015-03-24 Thread Jason Low
v1->v2: As suggested by Michal, we can split up the v1 patch into 2 patches. The first patch addresses potentially incorrect usages of ACCESS_ONCE(). The second patch is more of a "cleanup" patch to convert the rest of the ACCESS_ONCE() reads in mm/ to use the new READ_ONCE() API. This makes

Re: [PATCH v2] sched, timer: Use atomics for thread_group_cputimer to improve scalability

2015-03-05 Thread Jason Low
On Thu, 2015-03-05 at 16:20 +0100, Frederic Weisbecker wrote: > On Mon, Mar 02, 2015 at 01:44:04PM -0800, Linus Torvalds wrote: > > On Mon, Mar 2, 2015 at 1:16 PM, Jason Low wrote: > > > > > > In original code, we set cputimer->running first so it is running while &

Re: [PATCH v2] sched, timer: Use atomics for thread_group_cputimer to improve scalability

2015-03-05 Thread Jason Low
On Thu, 2015-03-05 at 16:35 +0100, Frederic Weisbecker wrote: > On Mon, Mar 02, 2015 at 10:42:11AM -0800, Jason Low wrote: > > +/* Sample thread_group_cputimer values in "cputimer", copy results to > > "times" */ > > +static inline void sample_
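
A sketch of the lockless sampling helper under discussion, with C11 atomics standing in for the kernel's atomic64 accessors; each load is individually atomic (the snapshot as a whole is not, which is part of what the thread debates):

```c
#include <stdatomic.h>

struct cputimer_sketch {
    atomic_ullong utime, stime, sum_exec_runtime;
};

/* Snapshot the three fields without holding a lock. */
static void sample_cputimer(struct cputimer_sketch *ct,
                            unsigned long long *utime,
                            unsigned long long *stime,
                            unsigned long long *runtime)
{
    *utime   = atomic_load(&ct->utime);
    *stime   = atomic_load(&ct->stime);
    *runtime = atomic_load(&ct->sum_exec_runtime);
}
```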

Re: sched: softlockups in multi_cpu_stop

2015-03-06 Thread Jason Low
On Fri, 2015-03-06 at 09:19 -0800, Davidlohr Bueso wrote: > On Fri, 2015-03-06 at 13:32 +0100, Ingo Molnar wrote: > > * Sasha Levin wrote: > > > > > I've bisected this to "locking/rwsem: Check for active lock before > > > bailing on spinning". Relevant parties Cc'ed. > > > > That would be: > >

Re: sched: softlockups in multi_cpu_stop

2015-03-06 Thread Jason Low
On Fri, 2015-03-06 at 11:05 -0800, Linus Torvalds wrote: > On Fri, Mar 6, 2015 at 10:57 AM, Jason Low wrote: > > > > Right, the can_spin_on_owner() was originally added to the mutex > > spinning code for optimization purposes, particularly so that we can > > avoid ad

Re: softlockups in multi_cpu_stop

2015-03-06 Thread Jason Low
On Fri, 2015-03-06 at 11:29 -0800, Jason Low wrote: > Hi Linus, > > Agreed, this is an issue we need to address, though we're just trying to > figure out if the change to rwsem_can_spin_on_owner() in "commit: > 37e9562453b" is really the one that's causing th

Re: softlockups in multi_cpu_stop

2015-03-06 Thread Jason Low
On Fri, 2015-03-06 at 13:24 -0800, Linus Torvalds wrote: > On Fri, Mar 6, 2015 at 1:12 PM, Jason Low wrote: > > > > + while (true) { > > + if (sem->owner != owner) > > + break; > > That looks *really* odd.

Re: softlockups in multi_cpu_stop

2015-03-06 Thread Jason Low
On Fri, 2015-03-06 at 14:15 -0800, Davidlohr Bueso wrote: > On Fri, 2015-03-06 at 13:12 -0800, Jason Low wrote: > > In owner_running() there are 2 conditions that would make it return > > false: if the owner changed or if the owner is not running. However, > > that patch

Re: softlockups in multi_cpu_stop

2015-03-06 Thread Jason Low
On Sat, 2015-03-07 at 10:10 +0800, Ming Lei wrote: > On Sat, Mar 7, 2015 at 10:07 AM, Davidlohr Bueso wrote: > > On Sat, 2015-03-07 at 09:55 +0800, Ming Lei wrote: > >> On Fri, 06 Mar 2015 14:15:37 -0800 > >> Davidlohr Bueso wrote: > >> > >> > On

Re: softlockups in multi_cpu_stop

2015-03-06 Thread Jason Low
On Sat, 2015-03-07 at 11:08 +0800, Ming Lei wrote: > On Sat, Mar 7, 2015 at 10:56 AM, Jason Low wrote: > > On Sat, 2015-03-07 at 10:10 +0800, Ming Lei wrote: > >> On Sat, Mar 7, 2015 at 10:07 AM, Davidlohr Bueso wrote: > >> > On Sat, 2015-03-07 at 09:55 +0800, Ming

Re: softlockups in multi_cpu_stop

2015-03-06 Thread Jason Low
On Sat, 2015-03-07 at 11:39 +0800, Ming Lei wrote: > On Sat, Mar 7, 2015 at 11:17 AM, Jason Low wrote: > > On Sat, 2015-03-07 at 11:08 +0800, Ming Lei wrote: > >> On Sat, Mar 7, 2015 at 10:56 AM, Jason Low wrote: > >> > On Sat, 2015-03-07 at 10:10 +0800, Ming Le

Re: softlockups in multi_cpu_stop

2015-03-06 Thread Jason Low
On Fri, 2015-03-06 at 13:12 -0800, Jason Low wrote: Just in case, here's the updated patch which addresses Linus's comments and with a changelog. Note: The changelog says that it fixes (locking/rwsem: Avoid deceiving lock spinners), though I still haven't seen full confirmation th

Re: softlockups in multi_cpu_stop

2015-03-06 Thread Jason Low
On Fri, 2015-03-06 at 20:44 -0800, Davidlohr Bueso wrote: > On Fri, 2015-03-06 at 20:31 -0800, Jason Low wrote: > > On Fri, 2015-03-06 at 13:12 -0800, Jason Low wrote: > > > > Just in case, here's the updated patch which addresses Linus's comments > > and

Re: softlockups in multi_cpu_stop

2015-03-06 Thread Jason Low
On Sat, 2015-03-07 at 13:54 +0800, Ming Lei wrote: > On Sat, Mar 7, 2015 at 12:31 PM, Jason Low wrote: > > On Fri, 2015-03-06 at 13:12 -0800, Jason Low wrote: > > Cc: Ming Lei > > Cc: Davidlohr Bueso > > Signed-off-by: Jason Low > > Reported-and-tested-by

[PATCH] locking/rwsem: Fix lock optimistic spinning when owner is not running

2015-03-06 Thread Jason Low
"guess" why we broke out of the loop to make this more readable. Reported-and-tested-by: Ming Lei Acked-by: Davidlohr Bueso Signed-off-by: Jason Low --- kernel/locking/rwsem-xadd.c | 31 +++ 1 files changed, 11 insertions(+), 20 deletions(-) diff --git a/kernel
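
A sketch of the post-loop decision this fix adds, with an illustrative bias value: the sem state is inspected directly instead of inferring why the spin loop exited:

```c
#include <stdatomic.h>
#include <stdbool.h>

#define WAITING_BIAS (-1L)   /* illustrative, not the kernel value */

static bool keep_spinning(_Atomic(void *) *owner, atomic_long *count)
{
    if (atomic_load(owner))
        return true;   /* a new writer owns the lock: spin on it */

    /* No owner: the lock is either free, or held by readers (who give
     * us nothing to spin on).  The count tells the two apart. */
    long c = atomic_load(count);
    return c == 0 || c == WAITING_BIAS;
}
```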

[PATCH 0/2] mutex: Modifications to mutex_spin_on_owner

2015-02-02 Thread Jason Low
This patchset contains a few modifications to mutex_spin_on_owner(). The first patch makes the optimistic spinner continue spinning whenever the owner changes, and the second patch refactors mutex_spin_on_owner() to micro optimize the code as well as make it simpler. Jason Low (2): mutex: In

[PATCH 1/2] mutex: In mutex_spin_on_owner(), return true when owner changes

2015-02-02 Thread Jason Low
with this patch. Signed-off-by: Jason Low --- kernel/locking/mutex.c |8 1 files changed, 4 insertions(+), 4 deletions(-) diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c index 4541951..04ea9ce 100644 --- a/kernel/locking/mutex.c +++ b/kernel/locking/mutex.c @@ -237,1
