sched: Improve load balancing in the presence of idle CPUs

2015-03-30 Thread Jason Low
uld this patch also help address some of the issues you are seeing? Signed-off-by: Jason Low --- kernel/sched/fair.c |4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index fdae26e..ba8ec1a 100644 --- a/kernel/sched/fair.c ++

Re: [PATCH V2] sched: Improve load balancing in the presence of idle CPUs

2015-03-31 Thread Jason Low
On Tue, 2015-03-31 at 14:28 +0530, Preeti U Murthy wrote: > Morten, > I am a bit confused about the problem you are pointing to. > I am unable to see the issue. What is it that I am missing ? Hi Preeti, Here is one of the potential issues that have been described from my understanding. In sit

Re: sched: Improve load balancing in the presence of idle CPUs

2015-03-31 Thread Jason Low
On Tue, 2015-03-31 at 14:07 +0530, Preeti U Murthy wrote: > Hi Jason, > > On 03/31/2015 12:25 AM, Jason Low wrote: > > Hi Preeti, > > > > I noticed that another commit 4a725627f21d converted the check in > > nohz_kick_needed() from idle_cpu() to rq->idl

Re: [PATCH 19/25] sched: Use bool function return values of true/false not 1/0

2015-03-31 Thread Jason Low
On Mon, Mar 30, 2015 at 4:46 PM, Joe Perches wrote: > * try_wait_for_completion - try to decrement a completion without > blocking > * @x: completion structure > * > - * Return: 0 if a decrement cannot be done without blocking > - * 1 if a decrement succeeded.

Re: [PATCH v3] locking/rwsem: reduce spinlock contention in wakeup after up_read/up_write

2015-04-28 Thread Jason Low
On Tue, 2015-04-28 at 19:17 +0200, Peter Zijlstra wrote: > To me it makes more sense to reverse these two branches (identical code > wise of course) and put the special case first. > > Alternatively we could also do something like the below, which to my > eyes looks a little better still, but I d

Re: [PATCH v3] locking/rwsem: reduce spinlock contention in wakeup after up_read/up_write

2015-04-28 Thread Jason Low
On Tue, 2015-04-28 at 10:50 -0700, Jason Low wrote: > On Tue, 2015-04-28 at 19:17 +0200, Peter Zijlstra wrote: > > > To me it makes more sense to reverse these two branches (identical code > > wise of course) and put the special case first. > > > > Alternatively w

Re: sched: Improve load balancing in the presence of idle CPUs

2015-04-07 Thread Jason Low
On Fri, 2015-04-03 at 15:35 -0700, Tim Chen wrote: > I think we can get rid of the done_balancing boolean > and make it a bit easier to read if we change the above code to > > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c > index bcfe320..08317dc 100644 > --- a/kernel/sched/fair.c > +++

Re: [PATCH v2 1/2] rtmutex Real-Time Linux: Fixing kernel BUG at kernel/locking/rtmutex.c:997!

2015-04-07 Thread Jason Low
On Tue, Apr 7, 2015 at 5:04 AM, Peter Zijlstra wrote: > That smells like something we should be able to do without a lock. > > If we use {READ,WRITE}_ONCE() on those two fields (->active_timers and > ->next_timer) we should be able to do this without the spinlock. Yeah, when atomics were suggest

Re: [PATCH v2 1/2] rtmutex Real-Time Linux: Fixing kernel BUG at kernel/locking/rtmutex.c:997!

2015-04-07 Thread Jason Low
On Tue, 2015-04-07 at 21:17 +0200, Thomas Gleixner wrote: > On Tue, 7 Apr 2015, Jason Low wrote: > > The lock shouldn't be used in get_next_timer_interrupt() either right? > > > > unsigned long get_next_timer_interrupt(unsigned long now) > > { > > ...

Re: sched: Improve load balancing in the presence of idle CPUs

2015-04-07 Thread Jason Low
On Tue, 2015-04-07 at 12:39 -0700, Tim Chen wrote: > How about consolidating the code for passing the > nohz balancing and call it at both places. > Something like below. Make the code more readable. > > Tim > > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c > index 40667cb..16f6904 1

Re: sched: Improve load balancing in the presence of idle CPUs

2015-04-07 Thread Jason Low
On Sat, 2015-04-04 at 15:29 +0530, Preeti U Murthy wrote: > Solution 1: As exists in the mainline > Solution 2: nohz_idle_balance(); rebalance_domains() on the ILB CPU > Solution 3: Above patch. > > I observe that Solution 3 is not as aggressive in spreading load as > Solution 2. With Solution 2,

Re: sched: Improve load balancing in the presence of idle CPUs

2015-04-07 Thread Jason Low
On Tue, 2015-04-07 at 16:28 -0700, Jason Low wrote: > Okay, so perhaps we can also try continuing nohz load balancing if we > find that there are overloaded CPUs in the system. Something like the following. --- diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index fdae26e..d636bf7

[PATCH 0/2] locking: Simplify mutex and rwsem spinning code

2015-04-08 Thread Jason Low
This patchset applies on top of tip. Jason Low (2): locking/mutex: Further refactor mutex_spin_on_owner() locking/rwsem: Use a return variable in rwsem_spin_on_owner() kernel/locking/mutex.c | 14 -- kernel/locking/rwsem-xadd.c | 25 - 2 files

[PATCH 1/2] locking/mutex: Further refactor mutex_spin_on_owner()

2015-04-08 Thread Jason Low
off-by: Jason Low --- kernel/locking/mutex.c | 14 -- 1 files changed, 4 insertions(+), 10 deletions(-) diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c index 16b2d3c..4cccea6 100644 --- a/kernel/locking/mutex.c +++ b/kernel/locking/mutex.c @@ -224,20 +

[PATCH 2/2] locking/rwsem: Use a return variable in rwsem_spin_on_owner()

2015-04-08 Thread Jason Low
Ingo suggested for mutex_spin_on_owner() that having multiple return statements is not the cleanest approach, especially when holding locks. The same thing applies to the rwsem variant. This patch rewrites much of this function to use a "ret" return value. Signed-off-by: Jason Low -
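
As a rough, simplified illustration of the single-return-value style this patch describes (stub helpers, invented names -- not the kernel's rwsem_spin_on_owner() itself): every early return inside the protected region becomes an assignment to "ret" plus a break, so the matching unlock sits in exactly one place.

/* Simplified, stand-alone illustration of the "ret" style; the helpers
 * are stubs and this is not the kernel function. */
#include <stdbool.h>

static bool owner_running(void) { return false; }	/* placeholder */
static bool should_stop(void)   { return false; }	/* placeholder */

static bool spin_on_owner(void)
{
	bool ret = true;

	/* take protection here, e.g. rcu_read_lock() in the real code */
	while (owner_running()) {
		if (should_stop()) {
			ret = false;	/* instead of an early return */
			break;
		}
		/* cpu_relax()-style pause would go here */
	}
	/* drop protection here -- a single exit point */

	return ret;
}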

Re: [PATCH 0/2] locking: Simplify mutex and rwsem spinning code

2015-04-08 Thread Jason Low
On Wed, 2015-04-08 at 12:49 -0700, Davidlohr Bueso wrote: > On Wed, 2015-04-08 at 12:39 -0700, Jason Low wrote: > > This patchset applies on top of tip. > > > > Jason Low (2): > > locking/mutex: Further refactor mutex_spin_on_owner() > > lockin

Re: [PATCH 2/2] locking/rwsem: Use a return variable in rwsem_spin_on_owner()

2015-04-09 Thread Jason Low
On Thu, 2015-04-09 at 11:16 -0700, Linus Torvalds wrote: > On Thu, Apr 9, 2015 at 11:08 AM, Linus Torvalds > wrote: > > > > The pointer is a known-safe kernel pointer - it's just that it was > > "known safe" a few instructions ago, and might be rcu-free'd at any > > time. > > Actually, we could e

Re: sched: Improve load balancing in the presence of idle CPUs

2015-04-08 Thread Jason Low
On Wed, 2015-04-08 at 16:42 +0530, Srikar Dronamraju wrote: > * Jason Low [2015-04-07 17:07:46]: > > @@ -7687,7 +7700,7 @@ static inline bool nohz_kick_needed(struct rq *rq) > > int nr_busy, cpu = rq->cpu; > > bool kick = false; > > > > - if (un

Re: sched: Improve load balancing in the presence of idle CPUs

2015-04-08 Thread Jason Low
On Wed, 2015-04-08 at 16:42 +0530, Srikar Dronamraju wrote: > * Jason Low [2015-04-07 17:07:46]: > > > On Tue, 2015-04-07 at 16:28 -0700, Jason Low wrote: > > > > > Okay, so perhaps we can also try continuing nohz load balancing if we > > > find that the

Re: [PATCH 2/2] locking/rwsem: Use a return variable in rwsem_spin_on_owner()

2015-04-09 Thread Jason Low
On Thu, Apr 9, 2015 at 12:58 PM, Paul E. McKenney wrote: > On Thu, Apr 09, 2015 at 12:43:38PM -0700, Jason Low wrote: >> On Thu, 2015-04-09 at 11:16 -0700, Linus Torvalds wrote: >> > On Thu, Apr 9, 2015 at 11:08 AM, Linus Torvalds >> > wrote: >> > > &g

Re: [PATCH 2/2] locking/rwsem: Use a return variable in rwsem_spin_on_owner()

2015-04-09 Thread Jason Low
On Thu, 2015-04-09 at 11:16 -0700, Linus Torvalds wrote: > On Thu, Apr 9, 2015 at 11:08 AM, Linus Torvalds > wrote: > > > > The pointer is a known-safe kernel pointer - it's just that it was > > "known safe" a few instructions ago, and might be rcu-free'd at any > > time. > > Actually, we could e

[PATCH v2 1/5] sched, timer: Remove usages of ACCESS_ONCE in the scheduler

2015-04-28 Thread Jason Low
ACCESS_ONCE doesn't work reliably on non-scalar types. This patch removes the rest of the existing usages of ACCESS_ONCE in the scheduler, and uses the new READ_ONCE and WRITE_ONCE APIs. Signed-off-by: Jason Low --- include/linux/sched.h |4 ++-- kernel/f
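
A userspace sketch of the substitution this patch describes. The macro bodies below are simplified stand-ins, not the kernel definitions from include/linux/compiler.h; the point is only the usage pattern of replacing ACCESS_ONCE() accesses with READ_ONCE()/WRITE_ONCE().

/*
 * Illustrative userspace stand-ins, NOT the kernel macros from
 * include/linux/compiler.h -- shown only for the usage pattern.
 */
#include <stdio.h>

/* old style: volatile cast, reliable only for scalar types */
#define ACCESS_ONCE(x)		(*(volatile __typeof__(x) *)&(x))

/* new style: one explicit load/store per access */
#define READ_ONCE(x)		__atomic_load_n(&(x), __ATOMIC_RELAXED)
#define WRITE_ONCE(x, val)	__atomic_store_n(&(x), (val), __ATOMIC_RELAXED)

static unsigned long seq;

int main(void)
{
	WRITE_ONCE(seq, 1UL);			/* was: ACCESS_ONCE(seq) = 1UL; */
	printf("%lu\n", READ_ONCE(seq));	/* was: ... = ACCESS_ONCE(seq); */
	return 0;
}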

[PATCH v2 0/5] sched, timer: Improve scalability of itimers

2015-04-28 Thread Jason Low
d and timer, patch 1 also updates all existing usages of ACCESS_ONCE with the new READ_ONCE and WRITE_ONCE APIs in those areas. Jason Low (5): sched, timer: Remove usages of ACCESS_ONCE in the scheduler sched, numa: Document usages of mm->numa_scan_seq sched, timer: Use atomics in thread_group_

[PATCH v2 5/5] sched, timer: Use the atomic task_cputime in thread_group_cputimer

2015-04-28 Thread Jason Low
atomically, which also helps generalize the code. Suggested-by: Ingo Molnar Signed-off-by: Jason Low --- include/linux/init_task.h |6 ++ include/linux/sched.h |4 +--- kernel/sched/stats.h |6 +++--- kernel/time/posix-cpu-timers.c | 26

[PATCH v2 3/5] sched, timer: Use atomics in thread_group_cputimer to improve scalability

2015-04-28 Thread Jason Low
neric atomics and did not find the overhead to be much of an issue. An explanation for why this isn't an issue is that 32 bit systems usually have small numbers of CPUs, and cacheline contention from extra spinlocks called periodically is not really apparent on smaller systems. Signed-off-by:

[PATCH v2 2/5] sched, numa: Document usages of mm->numa_scan_seq

2015-04-28 Thread Jason Low
The p->mm->numa_scan_seq is accessed using READ_ONCE/WRITE_ONCE and modified without exclusive access. It is not clear why it is accessed this way. This patch provides some documentation on that. Signed-off-by: Jason Low --- kernel/sched/fair.c | 12 1 files chang

[PATCH v2 4/5] sched, timer: Provide an atomic task_cputime data structure

2015-04-28 Thread Jason Low
This patch adds an atomic variant of the task_cputime data structure, which can be used to store and update task_cputime statistics without needing to do locking. Suggested-by: Ingo Molnar Signed-off-by: Jason Low --- include/linux/sched.h | 17 + 1 files changed, 17
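
A minimal userspace sketch of what such an "atomic variant" can look like, assuming C11 atomics; the struct and helper below are illustrative (the kernel version would use atomic64_t fields), and the field names simply mirror utime, stime and sum_exec_runtime from task_cputime.

/* Minimal userspace sketch; illustrative only, not the kernel structure. */
#include <stdatomic.h>
#include <stdint.h>

struct task_cputime_atomic_sketch {
	_Atomic uint64_t utime;			/* user-mode time   */
	_Atomic uint64_t stime;			/* kernel-mode time */
	_Atomic uint64_t sum_exec_runtime;	/* total runtime    */
};

/* Individual fields can be updated without a lock around the whole struct. */
static inline void account_runtime(struct task_cputime_atomic_sketch *ct,
				   uint64_t delta)
{
	atomic_fetch_add_explicit(&ct->sum_exec_runtime, delta,
				  memory_order_relaxed);
}

The trade-off is that the fields are updated lock-free individually rather than read as one consistent snapshot, which the surrounding code has to tolerate.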

Re: [PATCH v2 1/5] sched, timer: Remove usages of ACCESS_ONCE in the scheduler

2015-04-29 Thread Jason Low
On Wed, 2015-04-29 at 13:15 -0400, Steven Rostedt wrote: > On Wed, 29 Apr 2015 13:05:55 -0400 > Waiman Long wrote: > > > > goto no_join; > > > @@ -2107,7 +2107,7 @@ void task_numa_fault(int last_cpupid, int mem_node, > > > int pages, int flags) > > > > > > static void reset_p

Re: [PATCH v2 2/5] sched, numa: Document usages of mm->numa_scan_seq

2015-04-29 Thread Jason Low
On Wed, 2015-04-29 at 14:14 -0400, Waiman Long wrote: > On 04/28/2015 04:00 PM, Jason Low wrote: > > The p->mm->numa_scan_seq is accessed using READ_ONCE/WRITE_ONCE > > and modified without exclusive access. It is not clear why it is > > accessed this way. This patch pro

Re: [PATCH v2 3/5] sched, timer: Use atomics in thread_group_cputimer to improve scalability

2015-04-29 Thread Jason Low
On Wed, 2015-04-29 at 14:43 -0400, Waiman Long wrote: > On 04/28/2015 04:00 PM, Jason Low wrote: > > void thread_group_cputimer(struct task_struct *tsk, struct task_cputime > > *times) > > { > > struct thread_group_cputimer *cputimer =&tsk->signal->

Re: [PATCH v2 3/5] sched, timer: Use atomics in thread_group_cputimer to improve scalability

2015-04-29 Thread Jason Low
On Wed, 2015-04-29 at 10:38 -0400, Rik van Riel wrote: > On 04/28/2015 04:00 PM, Jason Low wrote: > > While running a database workload, we found a scalability issue with > > itimers. > > > > Much of the problem was caused by the thread_group_cputimer spinlock. > &g

Re: [PATCH v2 2/5] sched, numa: Document usages of mm->numa_scan_seq

2015-04-30 Thread Jason Low
On Thu, 2015-04-30 at 14:42 -0400, Waiman Long wrote: > I do have a question of what kind of tearing you are talking about. Do > you mean the tearing due to mm being changed in the middle of the > access? The reason why I don't like this kind of construct is that I am > not sure if > the addres

Re: [PATCH v4 1/2] locking/rwsem: reduce spinlock contention in wakeup after up_read/up_write

2015-04-30 Thread Jason Low
will just quit. > > Signed-off-by: Waiman Long > Suggested-by: Peter Zijlstra (Intel) Acked-by: Jason Low

Re: [PATCH v2 2/5] sched, numa: Document usages of mm->numa_scan_seq

2015-04-30 Thread Jason Low
On Thu, 2015-04-30 at 11:54 -0700, Davidlohr Bueso wrote: > I also wonder why this patch is included in a set called > "sched, timer: Improve scalability of itimers" ;) Good point :) The reason these first 2 patches were included in this patchset is because patch 3 depended on patch 1 (particul

[PATCH v3 2/5] sched, numa: Document usages of mm->numa_scan_seq

2015-04-30 Thread Jason Low
On Thu, 2015-04-30 at 14:13 -0700, Jason Low wrote: > On Thu, 2015-04-30 at 14:42 -0400, Waiman Long wrote: > > > I do have a question of what kind of tearing you are talking about. Do > > you mean the tearing due to mm being changed in the middle of the > > access? The

Re: [PATCH] locking/rwsem: reduce spinlock contention in wakeup after up_read/up_write

2015-04-20 Thread Jason Low
On Fri, 2015-04-17 at 22:03 -0400, Waiman Long wrote: > diff --git a/include/linux/osq_lock.h b/include/linux/osq_lock.h > index 3a6490e..703ea5c 100644 > --- a/include/linux/osq_lock.h > +++ b/include/linux/osq_lock.h > @@ -32,4 +32,9 @@ static inline void osq_lock_init(struct > optimistic_spin_

Re: [PATCH 2/2] locking/rwsem: Use a return variable in rwsem_spin_on_owner()

2015-04-08 Thread Jason Low
On Thu, 2015-04-09 at 07:37 +0200, Ingo Molnar wrote: > The 'break' path does not seem to be equivalent, we used to do: > > > - rcu_read_unlock(); > > - return false; > > and now we'll do: > > > + ret = false; > ... > > + if (!READ_ONCE(se

Re: sched: Improve load balancing in the presence of idle CPUs

2015-04-13 Thread Jason Low
On Fri, 2015-04-10 at 14:07 +0530, Srikar Dronamraju wrote: > > > > > > > > #ifdef CONFIG_NO_HZ_COMMON > > > > +static inline bool nohz_kick_needed(struct rq *rq); > > > > + > > > > +static inline void pass_nohz_balance(struct rq *this_rq, int this_cpu) > > > > +{ > > > > + clear_bit(NOHZ_BA

Re: sched: Improve load balancing in the presence of idle CPUs

2015-04-13 Thread Jason Low
On Fri, 2015-04-10 at 14:07 +0530, Srikar Dronamraju wrote: > At this point, I also wanted to understand why we do > "nohz.next_balance++" nohz_balancer_kick()? So this looks like something that was added to avoid nohz_balancer_kick() getting called too frequently. Otherwise, it may get called in

Re: sched: Improve load balancing in the presence of idle CPUs

2015-04-13 Thread Jason Low
> > --- > > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c > > index fdae26e..d636bf7 100644 > > --- a/kernel/sched/fair.c > > +++ b/kernel/sched/fair.c > > @@ -7620,6 +7620,16 @@ out: > > } > > > > #ifdef CONFIG_NO_HZ_COMMON > > +static inline bool nohz_kick_needed(struct rq *rq); > >

Re: sched: Improve load balancing in the presence of idle CPUs

2015-04-13 Thread Jason Low
On Mon, 2015-04-13 at 15:49 -0700, Jason Low wrote: > hmm, so taking a look at the patch again, it looks like we pass nohz > balance even when the NOHZ_BALANCE_KICK is not set on the current CPU. > We should separate the 2 conditions: > > if (!test_bit(NOHZ_BALANCE_KICK, nohz

Re: [PATCH 2/4] locking/rtmutex: Use cmp-cmpxchg

2015-06-15 Thread Jason Low
Hi David, On Sat, Jun 6, 2015 at 8:27 AM, Davidlohr Bueso wrote: > On Fri, 2015-06-05 at 14:38 +0200, Thomas Gleixner wrote: >> On Tue, 19 May 2015, Davidlohr Bueso wrote: >> >> > Avoid unnecessary cmpxchg calls, all of our other locks >> > use it as well. >> > >> > Signed-off-by: Davidlohr Bueso

Re: [PATCH 2/4] locking/rtmutex: Use cmp-cmpxchg

2015-06-15 Thread Jason Low
On Mon, Jun 15, 2015 at 12:37 PM, Davidlohr Bueso wrote: > On Mon, 2015-06-15 at 11:34 -0700, Jason Low wrote: >> The CCAS technique was typically used in the slow paths for those >> other locks, where the chance of the operation returning false is >> higher. > > That
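
For context, CCAS ("check, then compare-and-swap") just means doing a cheap plain read first and only issuing the expensive cmpxchg when that read says it can succeed. A hedged sketch with invented names, using a GCC builtin in place of the kernel's cmpxchg():

/* Invented names; __sync_bool_compare_and_swap() stands in for cmpxchg(). */
#include <stdbool.h>

static inline bool ccas(unsigned long *word, unsigned long old, unsigned long new)
{
	/* cheap read first: skip the cmpxchg (and the exclusive cacheline
	 * access it requires) when it is clearly going to fail */
	if (*(volatile unsigned long *)word != old)
		return false;

	return __sync_bool_compare_and_swap(word, old, new);
}

Whether the extra read helps depends on how often the cmpxchg would fail, which is exactly what this thread is debating for the rtmutex fast path.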

Re: [PATCH 0/8] locking/core patches

2014-02-25 Thread Jason Low
On Mon, 2014-02-10 at 15:02 -0800, Andrew Morton wrote: > On Mon, 10 Feb 2014 20:58:20 +0100 Peter Zijlstra > wrote: > > > Hi all, > > > > I would propose merging the following patches... > > > > The first set is mostly from Jason and tweaks the mutex adaptive > > spinning, AIM7 throughput num

Re: [PATCH 5/8] locking, mutex: Cancelable MCS lock for adaptive spinning

2014-02-25 Thread Jason Low
On Mon, 2014-02-10 at 20:58 +0100, Peter Zijlstra wrote: > +unqueue: > + /* > + * Step - A -- stabilize @prev > + * > + * Undo our @prev->next assignment; this will make @prev's > + * unlock()/unqueue() wait for a next pointer since @lock points to us > + * (or later)

Re: [PATCH 5/8] locking, mutex: Cancelable MCS lock for adaptive spinning

2014-02-26 Thread Jason Low
On Wed, 2014-02-26 at 10:22 +0100, Peter Zijlstra wrote: > On Tue, Feb 25, 2014 at 11:56:19AM -0800, Jason Low wrote: > > On Mon, 2014-02-10 at 20:58 +0100, Peter Zijlstra wrote: > > > + for (;;) { > > > + if (prev->next == node && > > >

Re: [tip:core/locking] locking/mutexes: Unlock the mutex without the wait_lock

2014-03-12 Thread Jason Low
On Wed, 2014-03-12 at 13:24 +0100, Peter Zijlstra wrote: > On Tue, Mar 11, 2014 at 05:41:23AM -0700, tip-bot for Jason Low wrote: > > kernel/locking/mutex.c | 8 > > 1 file changed, 4 insertions(+), 4 deletions(-) > > > > diff --git a/kernel/locking/mutex

Re: [locking/mutexes] WARNING: CPU: 1 PID: 77 at kernel/locking/mutex-debug.c:82 debug_mutex_unlock()

2014-03-12 Thread Jason Low
Hi Fengguang, Can you try out this patch? https://lkml.org/lkml/2014/3/12/243

Re: [locking/mutexes] WARNING: CPU: 1 PID: 77 at kernel/locking/mutex-debug.c:82 debug_mutex_unlock()

2014-03-14 Thread Jason Low
On Fri, 2014-03-14 at 15:31 -0400, Sasha Levin wrote: > On 03/12/2014 09:46 PM, Jason Low wrote: > > Hi Fengguang, > > > > Can you try out this patch? > > > > https://lkml.org/lkml/2014/3/12/243 > > Hi Jason, > > It fixes the problem for me. Hi Sasha

Re: [tip:x86/urgent] x86 idle: Repair large-server 50-watt idle-power regression

2014-03-18 Thread Jason Low
On Tue, Mar 18, 2014 at 2:16 AM, Peter Zijlstra wrote: > On Mon, Mar 17, 2014 at 05:20:10PM -0700, Davidlohr Bueso wrote: >> On Thu, 2013-12-19 at 11:51 -0800, tip-bot for Len Brown wrote: >> > Commit-ID: 40e2d7f9b5dae048789c64672bf3027fbb663ffa >> > Gitweb: >> > http://git.kernel.org/tip/40

Re: [PATCH 5/8] locking, mutex: Cancelable MCS lock for adaptive spinning

2014-02-10 Thread Jason Low
On Mon, 2014-02-10 at 20:58 +0100, Peter Zijlstra wrote: > +void osq_unlock(struct optimistic_spin_queue **lock) > +{ > + struct optimistic_spin_queue *node = this_cpu_ptr(&osq_node); > + struct optimistic_spin_queue *next; > + > + /* > + * Fast path for the uncontended case. > +

Re: [PATCH 5/8] locking, mutex: Cancelable MCS lock for adaptive spinning

2014-02-10 Thread Jason Low
On Mon, 2014-02-10 at 22:32 +0100, Peter Zijlstra wrote: > On Mon, Feb 10, 2014 at 01:15:59PM -0800, Jason Low wrote: > > On Mon, 2014-02-10 at 20:58 +0100, Peter Zijlstra wrote: > > > +void osq_unlock(struct optimistic_spin_queue **lock) > > > +{ > > >

Re: [PATCH 3/8] mutex: Modify the way optimistic spinners are queued

2014-02-10 Thread Jason Low
om > Cc: chegu_vi...@hp.com > Cc: waiman.l...@hp.com > Cc: paul...@linux.vnet.ibm.com > Cc: torva...@linux-foundation.org > Signed-off-by: Jason Low > Signed-off-by: Peter Zijlstra > Link: > http://lkml.kernel.org/r/1390936396-3962-3-git-send-email-jason

Re: [PATCH 2/2] sched: add statistic for rq->max_idle_balance_cost

2014-01-22 Thread Jason Low
On Wed, 2014-01-22 at 17:09 +0100, Peter Zijlstra wrote: > On Wed, Jan 22, 2014 at 04:24:13PM +0800, Alex Shi wrote: > > From: Alex Shi > > Date: Tue, 21 Jan 2014 13:28:55 +0800 > > Subject: [RFC PATCH] sched: add statistic for rq->max_idle_balance_cost > > > > It's useful to track this value in

[RFC 0/3] mutex: Reduce spinning contention when there is no lock owner

2014-01-14 Thread Jason Low
While optimistic spinning is beneficial to performance, I have found that threads can potentially spin for a long time while there is no lock owner during high contention cases. In these scenarios, too much spinning can reduce performance. This RFC patchset attempts to address some of the issues wi

[RFC 1/3] mutex: In mutex_can_spin_on_owner(), return false if task need_resched()

2014-01-14 Thread Jason Low
The mutex_can_spin_on_owner() function should also return false if the task needs to be rescheduled. Signed-off-by: Jason Low --- kernel/locking/mutex.c |3 +++ 1 files changed, 3 insertions(+), 0 deletions(-) diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c index 4dd6e4c
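
A stand-alone sketch of the check being added, with stubbed helpers so it compiles outside the kernel; the real function is mutex_can_spin_on_owner() in kernel/locking/mutex.c and the helper names below are placeholders.

/* Stand-alone sketch; helpers are stubs, not kernel APIs. */
#include <stdbool.h>

static bool needs_resched(void)    { return false; }	/* placeholder */
static bool owner_is_running(void) { return true;  }	/* placeholder */

static bool can_spin_on_owner(void)
{
	/* If this task is about to be scheduled out, spinning on the lock
	 * owner only delays that -- bail out before we start. */
	if (needs_resched())
		return false;

	return owner_is_running();
}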

[RFC 3/3] mutex: When there is no owner, stop spinning after too many tries

2014-01-14 Thread Jason Low
value, but any suggestions on another method to determine the threshold are welcomed. Signed-off-by: Jason Low --- kernel/locking/mutex.c | 10 +++--- 1 files changed, 7 insertions(+), 3 deletions(-) diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c index b500cc7..9465604 1006

[RFC 2/3] mutex: Modify the way optimistic spinners are queued

2014-01-14 Thread Jason Low
Signed-off-by: Jason Low --- kernel/locking/mutex.c | 13 - 1 files changed, 8 insertions(+), 5 deletions(-) diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c index 85c6be1..b500cc7 100644 --- a/kernel/locking/mutex.c +++ b/kernel/locking/mutex.c @@ -419,6 +419,7 @@ __mu

Re: [RFC 3/3] mutex: When there is no owner, stop spinning after too many tries

2014-01-14 Thread Jason Low
On Tue, 2014-01-14 at 17:00 -0800, Andrew Morton wrote: > On Tue, 14 Jan 2014 16:33:10 -0800 Jason Low wrote: > > > When running workloads that have high contention in mutexes on an 8 socket > > machine, spinners would often spin for a long time with no lock owner. > >

Re: [RFC 3/3] mutex: When there is no owner, stop spinning after too many tries

2014-01-14 Thread Jason Low
On Tue, 2014-01-14 at 17:06 -0800, Davidlohr Bueso wrote: > On Tue, 2014-01-14 at 16:33 -0800, Jason Low wrote: > > When running workloads that have high contention in mutexes on an 8 socket > > machine, spinners would often spin for a long time with no lock owner. > > >

Re: [RFC 2/3] mutex: Modify the way optimistic spinners are queued

2014-01-15 Thread Jason Low
On Wed, 2014-01-15 at 10:10 -0500, Waiman Long wrote: > On 01/14/2014 07:33 PM, Jason Low wrote: > > * When there's no owner, we might have preempted between the > > @@ -503,8 +504,10 @@ __mutex_lock_common(struct mutex *lock, long state, >

Re: [PATCH 4/5] futex: Avoid taking hb lock if nothing to wakeup

2013-11-22 Thread Jason Low
On Fri, Nov 22, 2013 at 5:25 PM, Linus Torvalds wrote: > On Fri, Nov 22, 2013 at 4:56 PM, Davidlohr Bueso wrote: >> In futex_wake() there is clearly no point in taking the hb->lock if >> we know beforehand that there are no tasks to be woken. This comes >> at the smaller cost of doing some atomic
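
A userspace sketch of the idea quoted above -- skip the hash-bucket lock entirely when nothing is queued. The waiter counter and names are illustrative, not the futex implementation:

/* Userspace sketch; not kernel futex code. */
#include <pthread.h>
#include <stdatomic.h>

struct bucket {
	pthread_mutex_t	lock;
	atomic_int	waiters;	/* bumped before a waiter queues itself */
};

static int wake_one(struct bucket *b)
{
	/* cheap atomic read first: the common "nobody is waiting" case
	 * never touches the lock at all */
	if (atomic_load_explicit(&b->waiters, memory_order_relaxed) == 0)
		return 0;

	pthread_mutex_lock(&b->lock);
	/* ...dequeue one waiter and wake it here... */
	pthread_mutex_unlock(&b->lock);
	return 1;
}

The truncated "smaller cost of doing some atomic ..." presumably refers to maintaining such a counter: the waiter side must publish its increment before the waker can safely skip the lock.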

Re: [RFC PATCH 2/5] futex: add optimistic spinning to FUTEX_SPIN_LOCK

2014-07-21 Thread Jason Low
On Mon, 2014-07-21 at 11:24 -0400, Waiman Long wrote: > This patch adds code to do optimistic spinning for the FUTEX_SPIN_LOCK > primitive on the futex value when the lock owner is running. It is > the same optimistic spinning technique that is done in the mutex and > rw semaphore code to improve t

[PATCH 0/3] sched: Idle balance patches

2014-04-23 Thread Jason Low
nit(). Patch #3 is a performance related patch. It stops searching for more tasks to pull while traversing the domains in idle balance if we find that there are runnable tasks. This patch resulted in approximately a 6% performance improvement to a Java server workload on an 8 socket machine. Jason Low

[PATCH 2/3] sched: Initialize newidle balance stats in sd_numa_init()

2014-04-23 Thread Jason Low
Also initialize the per-sd variables for newidle load balancing in sd_numa_init(). Signed-off-by: Jason Low --- kernel/sched/core.c |2 ++ 1 files changed, 2 insertions(+), 0 deletions(-) diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 3f12533..b1c6cb9 100644 --- a/kernel

[PATCH 1/3] sched, balancing: Update rq->max_idle_balance_cost whenever newidle balance is attempted

2014-04-23 Thread Jason Low
owsing the domains. Signed-off-by: Jason Low --- kernel/sched/fair.c |9 + 1 files changed, 5 insertions(+), 4 deletions(-) diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 43232b8..3e3ffb8 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -6658,6 +6658,7 @@

[PATCH 3/3] sched, fair: Stop searching for tasks in newidle balance if there are runnable tasks

2014-04-23 Thread Jason Low
a 6% performance improvement when running a Java Server workload on an 8 socket machine. Signed-off-by: Jason Low --- kernel/sched/fair.c |8 ++-- 1 files changed, 6 insertions(+), 2 deletions(-) diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 3e3ffb8..232518c 100644 --- a/ker

Re: [PATCH 3/3] sched, fair: Stop searching for tasks in newidle balance if there are runnable tasks

2014-04-24 Thread Jason Low
On Thu, 2014-04-24 at 04:51 +0200, Mike Galbraith wrote: > On Wed, 2014-04-23 at 18:30 -0700, Jason Low wrote: > > It was found that when running some workloads (such as AIM7) on large > > systems > > with many cores, CPUs do not remain idle for long. Thus, tasks can > &

Re: [PATCH 3/3] sched, fair: Stop searching for tasks in newidle balance if there are runnable tasks

2014-04-24 Thread Jason Low
On Thu, 2014-04-24 at 09:15 +0200, Peter Zijlstra wrote: > On Wed, Apr 23, 2014 at 06:30:35PM -0700, Jason Low wrote: > > It was found that when running some workloads (such as AIM7) on large > > systems > > with many cores, CPUs do not remain idle for long. Thus, tasks can

Re: [PATCH 1/3] sched, balancing: Update rq->max_idle_balance_cost whenever newidle balance is attempted

2014-04-24 Thread Jason Low
On Thu, 2014-04-24 at 14:44 +0200, Peter Zijlstra wrote: > On Thu, Apr 24, 2014 at 02:04:15PM +0200, Peter Zijlstra wrote: > > On Thu, Apr 24, 2014 at 03:44:47PM +0530, Preeti U Murthy wrote: > > > What about the update of next_balance field? See the code snippet below. > > > This will also be skip

Re: [PATCH 1/3] sched, balancing: Update rq->max_idle_balance_cost whenever newidle balance is attempted

2014-04-24 Thread Jason Low
On Thu, 2014-04-24 at 19:14 +0200, Peter Zijlstra wrote: > On Thu, Apr 24, 2014 at 09:53:37AM -0700, Jason Low wrote: > > > > So I thought that the original rationale (commit 1bd77f2d) behind > > updating rq->next_balance in idle_balance() is that, if we are going > >

Re: [PATCH 3/3] sched, fair: Stop searching for tasks in newidle balance if there are runnable tasks

2014-04-24 Thread Jason Low
On Thu, 2014-04-24 at 18:52 +0200, Peter Zijlstra wrote: > On Thu, Apr 24, 2014 at 09:43:09AM -0700, Jason Low wrote: > > If the below patch is what you were referring to, I believe this > > can help too. This was also something that I was testing out before > > we went wit

Re: [PATCH 3/3] sched, fair: Stop searching for tasks in newidle balance if there are runnable tasks

2014-04-24 Thread Jason Low
On Fri, 2014-04-25 at 04:45 +0200, Mike Galbraith wrote: > On Thu, 2014-04-24 at 18:52 +0200, Peter Zijlstra wrote: > > On Thu, Apr 24, 2014 at 09:43:09AM -0700, Jason Low wrote: > > > If the below patch is what you were referring to, I believe this > > > can help t

Re: [PATCH 1/3] sched, balancing: Update rq->max_idle_balance_cost whenever newidle balance is attempted

2014-04-25 Thread Jason Low
On Fri, 2014-04-25 at 10:42 +0530, Preeti U Murthy wrote: > I agree with this. However I am concerned with an additional point that > I have mentioned in my reply to Peter's mail on this thread. > > Should we verify if rq->next_balance update is independent of > pulled_tasks? sd->balance_interval

Re: [PATCH -tip/master 2/7] locking/mutex: Document quick lock release when unlocking

2014-07-30 Thread Jason Low
On Sun, 2014-07-27 at 22:18 -0700, Davidlohr Bueso wrote: > When unlocking, we always want to reach the slowpath with the lock's counter > indicating it is unlocked. -- as returned by the asm fastpath call or by > explicitly setting it. While doing so, at least in theory, we can optimize > and allo

Re: [PATCH -tip/master 3/7] locking/mcs: Remove obsolete comment

2014-07-30 Thread Jason Low
On Sun, 2014-07-27 at 22:18 -0700, Davidlohr Bueso wrote: > ... as we clearly inline mcs_spin_lock() now. > > Signed-off-by: Davidlohr Bueso Acked-by: Jason Low

Re: [PATCH -tip/master 4/7] locking/mutex: Refactor optimistic spinning code

2014-07-30 Thread Jason Low
On Sun, 2014-07-27 at 22:18 -0700, Davidlohr Bueso wrote: > +static bool mutex_optimistic_spin(struct mutex *lock, > + struct ww_acquire_ctx *ww_ctx, const bool > use_ww_ctx) > +{ > + struct task_struct *task = current; > + > + if (!mutex_can_spin_on_owner(lo

Re: [PATCH -tip/master 5/7] locking/mutex: Use MUTEX_SPIN_ON_OWNER when appropriate

2014-07-30 Thread Jason Low
nly depended on DEBUG and > SMP, it was ok to have the ->owner field conditional a bit > flexible. However by adding a new variable to the matter, > we can waste space with the unused field, ie: CONFIG_SMP && > (!CONFIG_MUTEX_SPIN_ON_OWNER && !CONFIG_DEBUG_MUTEX). > &

Re: [PATCH -tip v2 4/7] locking/mutex: Refactor optimistic spinning code

2014-07-30 Thread Jason Low
former a bit. Furthermore, this is similar to what we have in > rwsems. No logical changes. > > Signed-off-by: Davidlohr Bueso Acked-by: Jason Low

Re: [PATCH v3] locking/rwsem: Avoid double checking before try acquiring write lock

2014-09-17 Thread Jason Low
On Wed, 2014-09-17 at 11:34 +0200, Davidlohr Bueso wrote: > On Tue, 2014-09-16 at 17:16 -0700, Jason Low wrote: > > Commit 9b0fc9c09f1b checks for if there are known active lockers > > in order to avoid write trylocking using expensive cmpxchg() when > > it likely wouldn'

Re: [RFC 3/3] mutex: When there is no owner, stop spinning after too many tries

2014-01-15 Thread Jason Low
On Tue, 2014-01-14 at 16:33 -0800, Jason Low wrote: > When running workloads that have high contention in mutexes on an 8 socket > machine, spinners would often spin for a long time with no lock owner. > > One of the potential reasons for this is because a thread can be preempted >

Re: [RFC 3/3] mutex: When there is no owner, stop spinning after too many tries

2014-01-15 Thread Jason Low
On Thu, 2014-01-16 at 10:14 +0700, Linus Torvalds wrote: > On Thu, Jan 16, 2014 at 9:45 AM, Jason Low wrote: > > > > Any comments on the below change which unlocks the mutex before taking > > the lock->wait_lock to wake up a waiter? Thanks. > > Hmm. Doesn't

Re: [RFC 3/3] mutex: When there is no owner, stop spinning after too many tries

2014-01-16 Thread Jason Low
On Thu, 2014-01-16 at 13:05 +0100, Peter Zijlstra wrote: > On Wed, Jan 15, 2014 at 10:46:17PM -0800, Jason Low wrote: > > On Thu, 2014-01-16 at 10:14 +0700, Linus Torvalds wrote: > > > On Thu, Jan 16, 2014 at 9:45 AM, Jason Low wrote: > > > > > > > > Any

Re: [PATCH 2/2] sched: add statistic for rq->max_idle_balance_cost

2014-01-20 Thread Jason Low
On Mon, Jan 20, 2014 at 9:33 PM, Alex Shi wrote: > It's useful to track this value in debug mode. > > Signed-off-by: Alex Shi > --- > kernel/sched/debug.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/kernel/sched/debug.c b/kernel/sched/debug.c > index 1e43e70..f5c529a 100644 > --- a/

[PATCH v2] locking/rwsem: Avoid double checking before try acquiring write lock

2014-09-16 Thread Jason Low
ing that cmpxchg(). Thus, commit 9b0fc9c09f1b now just adds extra overhead. This patch deletes it. Also, add a comment on why we do an "extra check" of sem->count before the cmpxchg(). Signed-off-by: Jason Low --- kernel/locking/rwsem-xadd.c | 24 +--- 1 files c

Re: [PATCH v2] locking/rwsem: Avoid double checking before try acquiring write lock

2014-09-16 Thread Jason Low
On Tue, 2014-09-16 at 16:08 -0400, Peter Hurley wrote: > Hi Jason, > > On 09/16/2014 03:01 PM, Jason Low wrote: > > Commit 9b0fc9c09f1b checks for if there are known active lockers in > > order to avoid write trylocking using expensive cmpxchg() when it > > l

[PATCH v3] locking/rwsem: Avoid double checking before try acquiring write lock

2014-09-16 Thread Jason Low
ing that cmpxchg(). Thus, commit 9b0fc9c09f1b now just adds overhead. This patch modifies it so that we only do a check for if count == RWSEM_WAITING_BIAS. Also, add a comment on why we do an "extra check" of count before the cmpxchg(). Cc: Peter Hurley Cc: Tim Chen Signed-off-by: Jaso

Re: [tip:core/locking] locking/mutexes: Modify the way optimistic spinners are queued

2014-03-11 Thread Jason Low
On Tue, 2014-03-11 at 05:41 -0700, tip-bot for Jason Low wrote: > diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c > index e6d646b..82dad2c 100644 > --- a/kernel/locking/mutex.c > +++ b/kernel/locking/mutex.c > @@ -403,9 +403,9 @@ __mutex_lock_common(struct mutex *l

Re: [PATCH 1/3] sched, balancing: Update rq->max_idle_balance_cost whenever newidle balance is attempted

2014-04-25 Thread Jason Low
On Fri, 2014-04-25 at 09:58 +0200, Mike Galbraith wrote: > On Fri, 2014-04-25 at 00:13 -0700, Jason Low wrote: > > On Fri, 2014-04-25 at 10:42 +0530, Preeti U Murthy wrote: > > > I agree with this. However I am concerned with an additional point that > > > I have menti

Re: [PATCH 1/3] sched, balancing: Update rq->max_idle_balance_cost whenever newidle balance is attempted

2014-04-25 Thread Jason Low
ance() and rebalance_domains() use that function. Signed-off-by: Jason Low --- kernel/sched/fair.c | 81 --- 1 files changed, 51 insertions(+), 30 deletions(-) diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 43232b8..09c546c 100644 -

Re: [PATCH 1/3] sched, balancing: Update rq->max_idle_balance_cost whenever newidle balance is attempted

2014-04-28 Thread Jason Low
On Sat, 2014-04-26 at 16:50 +0200, Peter Zijlstra wrote: > On Fri, Apr 25, 2014 at 12:54:14PM -0700, Jason Low wrote: > > Preeti mentioned that sd->balance_interval is changed during load_balance(). > > Should we also consider updating the interval in rebalance_domains()

Re: [PATCH 1/3] sched, balancing: Update rq->max_idle_balance_cost whenever newidle balance is attempted

2014-04-28 Thread Jason Low
On Sun, 2014-04-27 at 14:01 +0530, Preeti Murthy wrote: > Hi Jason, Peter, > > The below patch looks good to me except for one point. > > In idle_balance() the below code snippet does not look right: > > - if (pulled_task || time_after(jiffies, this_rq->next_balance)) { > - /* > - * We are going

[PATCH 2/2] sched: Fix next_balance logic in rebalance_domains() and idle_balance()

2014-04-28 Thread Jason Low
en.rasmus...@arm.com Cc: as...@hp.com Cc: mi...@kernel.org Reviewed-by: Preeti U Murthy Signed-off-by: Jason Low --- kernel/sched/fair.c | 68 +- 1 files changed, 45 insertions(+), 23 deletions(-) diff --git a/kernel/sched/fair.c b/kernel/sch

[PATCH 1/2] sched: Fix updating rq->max_idle_balance_cost and rq->next_balance in idle_balance()

2014-04-28 Thread Jason Low
check/update those values even if a task gets enqueued while browsing the domains. Cc: daniel.lezc...@linaro.org Cc: alex@linaro.org Cc: efa...@gmx.de Cc: vincent.guit...@linaro.org Cc: morten.rasmus...@arm.com Cc: as...@hp.com Cc: mi...@kernel.org Reviewed-by: Preeti U Murthy Signed-off-by:

[PATCH 0/2] sched: Idle load balance fixes

2014-04-28 Thread Jason Low
can cause rq->max_idle_balance_cost and rq->next_balance to not get updated when they should. Patch #2 fixes how rq->next_balance gets updated in idle_balance() and rebalance_domains(). Jason Low (2): sched: Fix updating rq->max_idle_balance_cost and rq->next_balance in idle_ba

Re: [tip:locking/core] rwsem: Support optimistic spinning

2014-05-19 Thread Jason Low
On Mon, May 19, 2014 at 3:39 PM, Davidlohr Bueso wrote: > On Mon, 2014-05-19 at 14:47 -0700, Davidlohr Bueso wrote: >> Andrew had put this patch in -next for a while, and Stephen Rothwell was >> able to trigger some warnings: https://lkml.org/lkml/2014/5/19/627 >> >> 8<

Re: [PATCH v2] rwsem: Support optimistic spinning

2014-04-30 Thread Jason Low
On Wed, Apr 30, 2014 at 2:04 AM, Peter Zijlstra wrote: > On Mon, Apr 28, 2014 at 03:09:01PM -0700, Davidlohr Bueso wrote: >> +static inline bool rwsem_can_spin_on_owner(struct rw_semaphore *sem) >> +{ >> + int retval; > > And yet the return value is bool. > >> + struct task_struct *owner;

Re: [PATCH v3] rwsem: Support optimistic spinning

2014-05-01 Thread Jason Low
On Thu, May 1, 2014 at 9:39 AM, Tim Chen wrote: > On Wed, 2014-04-30 at 20:21 -0700, Davidlohr Bueso wrote: > >> + >> +static inline bool rwsem_can_spin_on_owner(struct rw_semaphore *sem) >> +{ >> + struct task_struct *owner; >> + bool on_cpu = true; >> + >> + if (need_resched()) >> +

Re: [PATCH 2/2] sched: Fix next_balance logic in rebalance_domains() and idle_balance()

2014-05-08 Thread Jason Low
On Mon, 2014-04-28 at 15:45 -0700, Jason Low wrote: > Currently, in idle_balance(), we update rq->next_balance when we pull_tasks. > However, it is also important to update this in the !pulled_tasks case too. > > When the CPU is "busy" (the CPU isn't idle), rq->n

Re: [PATCH 2/2] sched: Fix next_balance logic in rebalance_domains() and idle_balance()

2014-05-08 Thread Jason Low
On Thu, 2014-05-08 at 19:38 +0200, Ingo Molnar wrote: > * Jason Low wrote: > > > On Mon, 2014-04-28 at 15:45 -0700, Jason Low wrote: > > > Currently, in idle_balance(), we update rq->next_balance when we > > > pull_tasks. > > > However, it is also imp
