On Wed, Oct 2, 2013 at 12:19 PM, Waiman Long wrote:
> On 09/26/2013 06:42 PM, Jason Low wrote:
>>
>> On Thu, 2013-09-26 at 14:41 -0700, Tim Chen wrote:
>>>
>>> Okay, that would make sense for consistency because we always
>>> first set node->lock
On Fri, 2013-08-30 at 12:29 +0200, Peter Zijlstra wrote:
> rcu_read_lock();
> for_each_domain(cpu, sd) {
> + /*
> + * Decay the newidle max times here because this is a regular
> + * visit to all the domains. Decay ~0.5% per second.
> +
On Fri, 2013-08-30 at 12:18 +0200, Peter Zijlstra wrote:
> On Thu, Aug 29, 2013 at 01:05:36PM -0700, Jason Low wrote:
> > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> > index 58b0514..bba5a07 100644
> > --- a/kernel/sched/core.c
> > +++ b/kernel/sched/co
On Mon, 2013-09-09 at 13:44 +0200, Peter Zijlstra wrote:
> On Tue, Sep 03, 2013 at 11:02:59PM -0700, Jason Low wrote:
> > On Fri, 2013-08-30 at 12:29 +0200, Peter Zijlstra wrote:
> > > rcu_read_lock();
> > > for_each_domain(cpu, sd) {
> > > + /*
>
On Mon, 2013-09-09 at 13:49 +0200, Peter Zijlstra wrote:
> On Wed, Sep 04, 2013 at 12:10:01AM -0700, Jason Low wrote:
> > On Fri, 2013-08-30 at 12:18 +0200, Peter Zijlstra wrote:
> > > On Thu, Aug 29, 2013 at 01:05:36PM -0700, Jason Low wrote:
> > > > diff --git
() first. Then, if avg_idle exceeds the max, we set
it to the max.
Signed-off-by: Jason Low
Reviewed-by: Rik van Riel
Reviewed-by: Srikar Dronamraju
---
kernel/sched/core.c | 7 ++++---
1 files changed, 4 insertions(+), 3 deletions(-)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index
e CPU is not idle for longer than the cost
to balance.
Signed-off-by: Jason Low
---
arch/metag/include/asm/topology.h |1 +
include/linux/sched.h |1 +
include/linux/topology.h |3 +++
kernel/sched/core.c |3 ++-
kernel/sched/fair.c
| +23.1% | +5.1% | +0.0%
shared | +3.0% | +4.5% | +1.4%
--------
Jason Low (3):
sched: Reduce overestimating rq->avg_idle
sched: Consider max
v4->v5
- Increase the decay to 1% per second.
- Peter rewrote much of the logic.
This patch builds on patch 2 and periodically decays that max value to
do idle balancing per sched domain by approximately 1% per second. Also
decay the rq's max_idle_balance_cost value.
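To illustrate the decay arithmetic, here is a standalone sketch (the 253/256 factor, which works out to roughly 1% per call, and the field name are assumptions for illustration, not quoted from the patch):

#include <stdio.h>

/*
 * Standalone sketch of the decay described above (field name and the
 * exact 253/256 factor are assumptions, not taken from the patch).
 */
struct domain_cost {
        unsigned long long max_newidle_lb_cost;         /* nanoseconds */
};

/* Called roughly once per second: decay the recorded max by about 1%. */
static void decay_max_lb_cost(struct domain_cost *d)
{
        d->max_newidle_lb_cost = (d->max_newidle_lb_cost * 253) / 256;
}

int main(void)
{
        struct domain_cost d = { .max_newidle_lb_cost = 500000 };
        int sec;

        for (sec = 1; sec <= 5; sec++) {
                decay_max_lb_cost(&d);
                printf("after %d s: %llu ns\n", sec, d.max_newidle_lb_cost);
        }
        return 0;
}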
Signed-off-by: J
() first. Then, if avg_idle exceeds the max, we set
it to the max.
Signed-off-by: Jason Low
Reviewed-by: Rik van Riel
---
kernel/sched/core.c | 7 ++++---
1 files changed, 4 insertions(+), 3 deletions(-)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 05c39f0..93b18ef 100644
--- a
average. This further reduces the
chance
we attempt balancing when the CPU is not idle for longer than the cost to
balance.
I also limited the max cost of each domain to 5*sysctl_sched_migration_cost as
a way to prevent the max from becoming too inflated.
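Roughly speaking, the clamp is just the following (a sketch with assumed names, not the patch itself):

/*
 * Sketch of the clamp described above (names assumed): never record a
 * per-domain balance cost larger than 5x the migration cost.
 */
static unsigned long long clamp_balance_cost(unsigned long long cost_ns,
                                             unsigned long long migration_cost_ns)
{
        unsigned long long limit = 5 * migration_cost_ns;

        return cost_ns > limit ? limit : cost_ns;
}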
Signed-off-by: Jason Low
---
ar
-1.2%
shared | +9.0% | +13.0% | +6.5%
----
Jason Low (3):
sched: Reduce overestimating rq->avg_idle
sched: Consider max cost of idle balance p
e with max cost to do idle
balancing + sched_migration_cost. While using the max cost helps reduce
overestimating the average idle, the sched_migration_cost can help account
for those additional costs of idle balancing.
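In other words, the gate amounts to something like this (a sketch with assumed parameter names, not the patch itself):

/*
 * Attempt newidle balancing in a domain only when the expected idle time
 * covers the worst observed balance cost plus the generic migration cost.
 */
static int worth_newidle_balance(unsigned long long avg_idle_ns,
                                 unsigned long long max_balance_cost_ns,
                                 unsigned long long migration_cost_ns)
{
        return avg_idle_ns > max_balance_cost_ns + migration_cost_ns;
}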
Signed-off-by: Jason Low
---
arch/metag/include/asm/topology.h |
On Mon, 2013-09-02 at 12:24 +0530, Srikar Dronamraju wrote:
> If we face a runq lock contention, then domain_cost can go up.
> The runq lock contention could be temporary, but we carry the domain
> cost forever (i.e. till the next reboot). How about averaging the cost +
> penalty for unsuccessful b
On Wed, Sep 25, 2013 at 3:10 PM, Tim Chen wrote:
> We will need the MCS lock code for doing optimistic spinning for rwsem.
> Extracting the MCS code from mutex.c and putting it into its own file allows us
> to reuse this code easily for rwsem.
>
> Signed-off-by: Tim Chen
> Signed-off-by: Davidlohr Bueso
On Thu, 2013-09-26 at 13:06 -0700, Davidlohr Bueso wrote:
> On Thu, 2013-09-26 at 12:27 -0700, Jason Low wrote:
> > On Wed, Sep 25, 2013 at 3:10 PM, Tim Chen
> > wrote:
> > > We will need the MCS lock code for doing optimistic spinning for rwsem.
> > > Extract
On Thu, 2013-09-26 at 13:40 -0700, Davidlohr Bueso wrote:
> On Thu, 2013-09-26 at 13:23 -0700, Jason Low wrote:
> > On Thu, 2013-09-26 at 13:06 -0700, Davidlohr Bueso wrote:
> > > On Thu, 2013-09-26 at 12:27 -0700, Jason Low wrote:
> > > > On Wed, Sep 25, 2013 at 3:1
On Thu, 2013-09-26 at 14:41 -0700, Tim Chen wrote:
> On Thu, 2013-09-26 at 14:09 -0700, Jason Low wrote:
> > On Thu, 2013-09-26 at 13:40 -0700, Davidlohr Bueso wrote:
> > > On Thu, 2013-09-26 at 13:23 -0700, Jason Low wrote:
> > > > On Thu, 2013-09-26 at 13:0
On Fri, 2013-09-27 at 08:02 +0200, Ingo Molnar wrote:
> * Tim Chen wrote:
>
> > > If we prefer to optimize this a bit though, perhaps we can first move
> > > the node->lock = 0 so that it gets executed after the "if (likely(prev
> > > == NULL)) {}" code block and then delete "node->lock = 1" in
ssignment so that it occurs after the if (likely(prev == NULL)) check.
This might also help make it clearer as to how the node->locked variable
is used in MCS locks.
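For illustration, here is a userspace C11 model of the lock side with the ->locked initialization moved after the (prev == NULL) check; this is only a sketch of the idea, not the kernel's mcslock.h:

#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Userspace C11 model of an MCS queue node. Field names mirror the
 * kernel's, but this is an illustration, not include/linux/mcslock.h.
 */
struct mcs_node {
        _Atomic(struct mcs_node *) next;
        atomic_bool locked;             /* set once the lock is handed to us */
};

static void mcs_lock(_Atomic(struct mcs_node *) *lock, struct mcs_node *node)
{
        struct mcs_node *prev;

        atomic_store_explicit(&node->next, NULL, memory_order_relaxed);

        /* Atomically queue ourselves at the tail of the waiter list. */
        prev = atomic_exchange(lock, node);
        if (prev == NULL)
                return;         /* queue was empty: lock acquired, no spinning */

        /*
         * Initialize ->locked only on the slowpath, i.e. after the
         * (prev == NULL) check -- the move being discussed above.
         */
        atomic_store_explicit(&node->locked, false, memory_order_relaxed);

        /* Publish our node to the predecessor, then wait for the handoff. */
        atomic_store_explicit(&prev->next, node, memory_order_release);
        while (!atomic_load_explicit(&node->locked, memory_order_acquire))
                ;               /* spin */
}

The move is safe in this model because the node only becomes visible to the predecessor through the release store to prev->next, which is ordered after the ->locked initialization.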
Signed-off-by: Jason Low
---
include/linux/mcslock.h |3 +--
1 files changed, 1 insertions(+), 2 deletions(-)
diff --git a/inc
On Fri, Sep 27, 2013 at 12:38 PM, Tim Chen wrote:
> BTW, is the above memory barrier necessary? It seems like the xchg
> instruction already provided a memory barrier.
>
> Now if we made the changes that Jason suggested:
>
>
> /* Init node */
> - node->locked = 0;
> node->n
ry barrier so that it is before the "ACCESS_ONCE(next->locked) = 1;".
Signed-off-by: Jason Low
Signed-off-by: Paul E. McKenney
Signed-off-by: Tim Chen
---
include/linux/mcslock.h | 7 +++----
1 files changed, 3 insertions(+), 4 deletions(-)
diff --git a/include/linux/mcslock.h
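Continuing the userspace C11 sketch from earlier in this thread, the unlock side with release ordering on the handoff (so everything done in the critical section is visible before the successor sees ->locked become true); again an illustration, not the posted patch:

static void mcs_unlock(_Atomic(struct mcs_node *) *lock, struct mcs_node *node)
{
        struct mcs_node *next = atomic_load_explicit(&node->next,
                                                     memory_order_acquire);

        if (next == NULL) {
                struct mcs_node *expected = node;

                /* No visible successor: try to release the lock outright. */
                if (atomic_compare_exchange_strong(lock, &expected, NULL))
                        return;

                /* A successor is queueing; wait for it to link itself in. */
                do {
                        next = atomic_load_explicit(&node->next,
                                                    memory_order_acquire);
                } while (next == NULL);
        }

        /*
         * The release ordering on this store is the barrier placement being
         * discussed: the critical section must be visible to the successor
         * before it observes ->locked as true.
         */
        atomic_store_explicit(&next->locked, true, memory_order_release);
}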
On Fri, Sep 27, 2013 at 7:19 PM, Paul E. McKenney
wrote:
> On Fri, Sep 27, 2013 at 04:54:06PM -0700, Jason Low wrote:
>> On Fri, Sep 27, 2013 at 4:01 PM, Paul E. McKenney
>> wrote:
>> > Yep. The previous lock holder's smp_wmb() won't keep either the compil
On Mon, 2013-09-30 at 11:51 -0400, Waiman Long wrote:
> On 09/28/2013 12:34 AM, Jason Low wrote:
> >> Also, below is what the mcs_spin_lock() and mcs_spin_unlock()
> >> functions would look like after applying the proposed changes.
> >>
> >> static
Should we do something similar with __down_read_trylock, such as
the following?
Signed-off-by: Jason Low
---
include/asm-generic/rwsem.h |3 +++
1 files changed, 3 insertions(+), 0 deletions(-)
diff --git a/include/asm-generic/rwsem.h b/include/asm-generic/rwsem.h
index bb1e2cd..47990dc
ip kernel with no patches. When using a 3.10-rc2 tip
kernel with just patches 1-7, the performance improvement of the
workload over the vanilla 3.10-rc2 tip kernel was about 25%.
Tested-by: Jason Low
Thanks,
Jason
On Tue, Jun 11, 2013 at 12:49 PM, Paul E. McKenney
wrote:
> On Tue, Jun 11, 2013 at 02:41:59PM -0400, Waiman Long wrote:
>> On 06/11/2013 12:36 PM, Paul E. McKenney wrote:
>> >
>> >>I am a bit concerned about the size of the head queue table itself.
>> >>RHEL6, for example, had defined CONFIG_NR_CPU
---
All other % difference results were within a 2% noise range.
Signed-off-by: Jason Low
---
include/linux/sched.h |4
kernel/sched/core.c |3 +++
kernel/sched/fair.c | 26 ++
kernel/sched/sched.h |6 ++
kernel/sysctl.c | 11 ++
On Tue, 2013-07-16 at 22:20 +0200, Peter Zijlstra wrote:
> On Tue, Jul 16, 2013 at 12:21:03PM -0700, Jason Low wrote:
> > When running benchmarks on an 8 socket 80 core machine with a 3.10 kernel,
> > there can be a lot of contention in idle_balance() and related functions.
>
On Wed, 2013-07-17 at 09:25 +0200, Peter Zijlstra wrote:
> On Tue, Jul 16, 2013 at 03:48:01PM -0700, Jason Low wrote:
> > On Tue, 2013-07-16 at 22:20 +0200, Peter Zijlstra wrote:
> > > On Tue, Jul 16, 2013 at 12:21:03PM -0700, Jason Low wrote:
> > > > When running ben
Hi Peter,
On Wed, 2013-07-17 at 11:39 +0200, Peter Zijlstra wrote:
> On Wed, Jul 17, 2013 at 01:11:41AM -0700, Jason Low wrote:
> > For the more complex model, are you suggesting that each completion time
> > is the time it takes to complete 1 iteration of the for_each_do
On Wed, 2013-07-17 at 20:01 +0200, Peter Zijlstra wrote:
> On Wed, Jul 17, 2013 at 01:51:51PM -0400, Rik van Riel wrote:
> > On 07/17/2013 12:18 PM, Peter Zijlstra wrote:
>
> > >So the way I see things is that the only way newidle balance can slow down
> > >things is if it runs when we could have
idle balance
need to be the same as the migration_cost in task_hot()? Can we keep
migration_cost default value used in task_hot() the same, but have a different
default value or increase migration_cost only when comparing it with avg_idle in
idle balance?
Signed-off-by: Jason Low
---
kernel/sched/c
> I wonder if we could get even more conservative values
> of avg_idle by clamping delta to max, before calling
> update_avg...
>
> Or rather, I wonder if that would matter enough to make
> a difference, and in what direction that difference would
> be.
>
> In other words:
>
> if (rq->idl
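A standalone sketch of that clamping (the 1/8 weight mirrors update_avg(); the parameter names are assumed):

/*
 * Cap the measured idle delta at the configured max before folding it
 * into the running average.
 */
static void update_avg_idle(unsigned long long *avg_idle_ns,
                            unsigned long long delta_ns,
                            unsigned long long max_ns)
{
        long long diff;

        if (delta_ns > max_ns)
                delta_ns = max_ns;

        /* update_avg()-style exponential moving average, 1/8 weight. */
        diff = (long long)delta_ns - (long long)*avg_idle_ns;
        *avg_idle_ns += diff / 8;
}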
On Wed, 2013-07-31 at 11:53 +0200, Peter Zijlstra wrote:
> No they're quite unrelated. I think you can measure the max time we've
> ever spend in newidle balance and use that to clip the values.
So I tried using the rq's max newidle balance cost to compare with the
average and used sysctl_migrati
hat
avg_idle and max_cost are) if the previous attempt on the rq or domain
succeeded in moving tasks. I was also wondering if we should periodically reset
the max cost. Both would require an extra field to be added to either the
rq or domain structure though.
Signed-off-by: Jason Low
---
arch/
On Thu, 2013-08-22 at 13:10 +0200, Peter Zijlstra wrote:
> Fully agreed, this is something we should do regardless -- for as long
> as we preserve the avg_idle() machinery anyway :-)
Okay, I'll have the avg_idle fix as part 1 of the v4 patchset.
> The thing you 'forgot' to mention is if this patc
On Thu, 2013-07-18 at 17:42 +0530, Srikar Dronamraju wrote:
> > >
> > > idle_balance(u64 idle_duration)
> > > {
> > > u64 cost = 0;
> > >
> > > for_each_domain(sd) {
> > > if (cost + sd->cost > idle_duration/N)
> > > break;
> > >
> > > ...
> > >
> > > sd->cost = (sd->cost
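The quoted sketch is cut off above; one plausible, self-contained reading of it (the averaging step and the clock helper are assumptions, not Peter's exact code) is:

#include <time.h>

/* Assumed helper: monotonic clock in nanoseconds. */
static unsigned long long now_ns(void)
{
        struct timespec ts;

        clock_gettime(CLOCK_MONOTONIC, &ts);
        return (unsigned long long)ts.tv_sec * 1000000000ULL + ts.tv_nsec;
}

/* Simplified stand-in for a sched domain (names assumed). */
struct domain {
        unsigned long long cost;        /* running estimate of balance cost, ns */
        struct domain *parent;
};

static void idle_balance(struct domain *sd, unsigned long long idle_duration,
                         unsigned int n)
{
        unsigned long long cost = 0;

        for (; sd; sd = sd->parent) {
                unsigned long long t0, this_cost;

                /* Give up once the budget (idle_duration/N) would be blown. */
                if (cost + sd->cost > idle_duration / n)
                        break;

                t0 = now_ns();
                /* ... try to pull tasks within this domain ... */
                this_cost = now_ns() - t0;

                /* Fold the measured cost into the per-domain estimate. */
                sd->cost = (sd->cost + this_cost) / 2;
                cost += this_cost;
        }
}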
On Thu, 2013-07-18 at 07:59 -0400, Rik van Riel wrote:
> On 07/18/2013 05:32 AM, Peter Zijlstra wrote:
> > On Wed, Jul 17, 2013 at 09:02:24PM -0700, Jason Low wrote:
> >
> >> I ran a few AIM7 workloads for the 8 socket HT enabled case and I needed
> >> to set N to
balancing gets skipped if the approximate cost of load balancing will
be greater than N% of the approximate time the CPU remains idle. Currently,
N is set to 10% though I'm searching for a more "ideal" way to compute this.
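Put differently, the check amounts to something like this (a sketch; names assumed, with N expressed as a percentage):

/*
 * Skip the balance attempt when its estimated cost exceeds N percent of
 * the time the CPU is expected to remain idle.
 */
static int balance_cost_acceptable(unsigned long long est_cost_ns,
                                   unsigned long long est_idle_ns,
                                   unsigned int n_percent)
{
        return est_cost_ns * 100 <= est_idle_ns * n_percent;
}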
Suggested-by: Peter Zijlstra
Suggested-by: Rik van Riel
Signe
On Fri, 2013-07-19 at 20:37 +0200, Peter Zijlstra wrote:
> On Thu, Jul 18, 2013 at 12:06:39PM -0700, Jason Low wrote:
>
> > N = 1
> > -
> > 19.21% reaim [k] __read_lock_failed
> > 14.79% reaim [k] mspin_lock
On Fri, 2013-07-19 at 16:54 +0530, Preeti U Murthy wrote:
> Hi Jason,
>
> I ran ebizzy and kernbench benchmarks on your 3.11-rc1 + your "V1
> patch" on a 1 socket, 16 core powerpc machine. I thought I would let you
> know the results before I try your V2.
>
> Ebizzy: 30 seconds run. The tab
On Sun, 2013-07-21 at 23:02 +0530, Preeti U Murthy wrote:
> Hi Jason,
>
> With V2 of your patch here are the results for the ebizzy run on
> 3.11-rc1 + patch on a 1 socket, 16 core powerpc machine. Each ebizzy
> run was for 30 seconds.
>
> Number_of_threads %improvement_with_patch
> 4
On Mon, 2013-07-22 at 12:31 +0530, Srikar Dronamraju wrote:
> >
> > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> > index e8b3350..da2cb3e 100644
> > --- a/kernel/sched/core.c
> > +++ b/kernel/sched/core.c
> > @@ -1348,6 +1348,8 @@ ttwu_do_wakeup(struct rq *rq, struct task_struct *p,
>
On Tue, 2013-07-23 at 16:36 +0530, Srikar Dronamraju wrote:
> >
> > A potential issue I have found with avg_idle is that it may sometimes be
> > not quite as accurate for the purposes of this patch, because it is
> > always given a max value (default is 100 ns). For example, a CPU
> > could ha
> > > Should we take the consideration of whether a idle_balance was
> > > successful or not?
> >
> > I recently ran fserver on the 8 socket machine with HT-enabled and found
> > that load balance was succeeding at a higher than average rate, but idle
> > balance was still lowering performance of
+10.7%
Signed-off-by: Jason Low
---
kernel/sched/core.c |1 +
kernel/sched/fair.c | 10 +-
kernel/sched/sched.h |5 +
3 files changed, 15 insertions(+), 1 deletions(-)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
On Mon, 2013-08-12 at 16:30 +0530, Srikar Dronamraju wrote:
> > /*
> > @@ -5298,6 +5300,8 @@ void idle_balance(int this_cpu, struct rq *this_rq)
> > continue;
> >
> > if (sd->flags & SD_BALANCE_NEWIDLE) {
> > + load_balance_attempted = true;
>
On Tue, 2016-07-19 at 19:53 +0300, Imre Deak wrote:
> On ma, 2016-07-18 at 10:47 -0700, Jason Low wrote:
> > On Mon, 2016-07-18 at 19:15 +0200, Peter Zijlstra wrote:
> > > I think we went over this before, that will also completely destroy
> > > performance
s disabled?
Thanks.
---
Signed-off-by: Jason Low
---
include/linux/mutex.h | 2 ++
kernel/locking/mutex.c | 61 +-
2 files changed, 58 insertions(+), 5 deletions(-)
diff --git a/include/linux/mutex.h b/include/linux/mutex.h
index 2cb7531..c1ca68d 10
On Tue, 2016-07-19 at 16:04 -0700, Jason Low wrote:
> Hi Imre,
>
> Here is a patch which prevents a thread from spending too much "time"
> waiting for a mutex in the !CONFIG_MUTEX_SPIN_ON_OWNER case.
>
> Would you like to try this out and see if this addresses the
On Wed, 2016-07-20 at 16:29 +0300, Imre Deak wrote:
> On ti, 2016-07-19 at 21:39 -0700, Jason Low wrote:
> > On Tue, 2016-07-19 at 16:04 -0700, Jason Low wrote:
> > > Hi Imre,
> > >
> > > Here is a patch which prevents a thread from spending too much "
On Wed, 2016-07-20 at 14:37 -0400, Waiman Long wrote:
> On 07/20/2016 12:39 AM, Jason Low wrote:
> > On Tue, 2016-07-19 at 16:04 -0700, Jason Low wrote:
> >> Hi Imre,
> >>
> >> Here is a patch which prevents a thread from spending too much "
On Fri, 2016-07-22 at 12:34 +0300, Imre Deak wrote:
> On to, 2016-07-21 at 15:29 -0700, Jason Low wrote:
> > On Wed, 2016-07-20 at 14:37 -0400, Waiman Long wrote:
> > > On 07/20/2016 12:39 AM, Jason Low wrote:
> > > > On Tue, 2016-07-19 at 16:04 -0700, Jaso
On Tue, 2016-08-23 at 09:35 -0700, Jason Low wrote:
> On Tue, 2016-08-23 at 09:17 -0700, Davidlohr Bueso wrote:
> > I have not looked at the patches yet, but are there any performance minutia
> > to be aware of?
>
> This would remove all of the mutex architecture specific o
On Thu, 2015-08-27 at 18:43 -0400, George Spelvin wrote:
> Jason Low wrote:
> > Frederic suggested that we just use a single "status" variable and
> > access the bits for the running and checking field. I am leaning towards
> > that method, so I might not include the
On Mon, 2015-08-31 at 08:15 -0700, Davidlohr Bueso wrote:
> On Tue, 2015-08-25 at 20:17 -0700, Jason Low wrote:
> > In fastpath_timer_check(), the task_cputime() function is always
> > called to compute the utime and stime values. However, this is not
> > necessary if th
timers set.
Signed-off-by: Jason Low
Reviewed-by: Oleg Nesterov
Reviewed-by: Frederic Weisbecker
Reviewed-by: Davidlohr Bueso
---
kernel/time/posix-cpu-timers.c | 11 +++--------
1 files changed, 3 insertions(+), 8 deletions(-)
diff --git a/kernel/time/posix-cpu-timers.c b/kernel/time/posix
throughput by more than 30%.
With this patch set (along with commit 1018016c706f mentioned above),
the performance hit of itimers almost completely goes away on the
16 socket system.
Jason Low (4):
timer: Optimize fastpath_timer_check()
timer: Check thread timers only when there are active thread
oleans.
This is a preparatory patch to convert the existing running integer
field to a boolean.
Suggested-by: George Spelvin
Signed-off-by: Jason Low
---
include/linux/init_task.h |2 +-
include/linux/sched.h |6 +++---
kernel/fork.c |2 +-
kernel/time/pos
there are no per-thread timers.
As suggested by George, we can put the task_cputime_zero() check in
check_thread_timers(), since that is more of an optimization to the
function. Similarly, we move the existing check of cputimer->running
to check_process_timers().
Signed-off-by: Jason Low
Revie
the thread_group_cputimer structure
maintain a boolean to signify when a thread in the group is already
checking for process wide timers, and adds extra logic in the fastpath
to check the boolean.
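A userspace sketch of that fastpath (the field and function names are assumptions, not the kernel structure): only proceed to the process-wide expiry check when timers are armed and nobody else in the group is already doing the check.

#include <stdatomic.h>
#include <stdbool.h>

struct group_cputimer {
        atomic_bool running;            /* process-wide timers armed? */
        atomic_bool checking_timer;     /* a thread is already checking them */
};

static bool fastpath_check_process_timers(struct group_cputimer *ct)
{
        if (!atomic_load(&ct->running))
                return false;           /* nothing armed */
        if (atomic_load(&ct->checking_timer))
                return false;           /* another thread has it covered */
        return true;
}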
Signed-off-by: Jason Low
Reviewed-by: Oleg Nesterov
---
include/linux/init_task.h |1
On Wed, 2015-10-14 at 17:18 -0400, George Spelvin wrote:
> I'm going to give 4/4 a closer look to see if the races with timer
> expiration make more sense to me than last time around.
> (E.g. do CPU time signals even work in CONFIG_NO_HZ_FULL?)
>
> But although I haven't yet convinced myself the c
On Thu, 2015-10-15 at 10:47 +0200, Ingo Molnar wrote:
> * Jason Low wrote:
>
> > While running a database workload on a 16 socket machine, there were
> > scalability issues related to itimers. The following link contains a
> > more detailed summary of the issues
On Fri, 2015-10-16 at 09:12 +0200, Ingo Molnar wrote:
> * Jason Low wrote:
>
> > > > With this patch set (along with commit 1018016c706f mentioned above),
> > > > the performance hit of itimers almost completely goes away on the
> > > > 16 so
On Tue, 2015-10-20 at 02:18 +0200, Frederic Weisbecker wrote:
> This way we might consume less space in the signal struct (well,
> depending on bool size or padding) and we don't need to worry about
> ordering between the running and checking_timers fields.
This looks fine to me. I ended up going
spent updating thread group cputimer timers was reduced
from 30% down to less than 1%.
Signed-off-by: Jason Low
---
include/linux/init_task.h |7 +++--
include/linux/sched.h | 12 +++--
kernel/fork.c |5 +---
kernel/sched/stats.h |
On Fri, 2015-01-23 at 10:33 +0100, Peter Zijlstra wrote:
> > + .running = ATOMIC_INIT(0), \
> > + atomic_t running;
> > + atomic_set(&sig->cputimer.running, 1);
> > @@ -174,7 +174,7 @@ static inline bool cputimer_running(struct task_struct
> > *ts
On Fri, 2015-01-23 at 10:25 +0100, Peter Zijlstra wrote:
> On Thu, Jan 22, 2015 at 07:31:53PM -0800, Jason Low wrote:
> > +static void update_gt_cputime(struct thread_group_cputimer *a, struct
> > task_cputime *b)
> > {
> > + if (b->u
On Fri, 2015-01-23 at 21:08 +0100, Peter Zijlstra wrote:
> On Fri, Jan 23, 2015 at 11:23:36AM -0800, Jason Low wrote:
> > On Fri, 2015-01-23 at 10:25 +0100, Peter Zijlstra wrote:
> > > On Thu, Jan 22, 2015 at 07:31:53PM -0800, Jason Low wrote:
> > > > +static
that we can
focus on catching warnings that can potentially cause bigger issues.
Signed-off-by: Jason Low
---
kernel/cgroup.c |2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index bb263d0..66684f3 100644
--- a/kernel/cgroup.c
+++ b/k
On Sun, 2015-01-25 at 23:36 -0800, Davidlohr Bueso wrote:
> When readers hold the semaphore, the ->owner is nil. As such,
> and unlike mutexes, '!owner' does not necessarily imply that
> the lock is free. This will cause writer spinners to potentially
> spin excessively as they've been misled to t
On Tue, 2015-01-27 at 11:10 -0500, Tejun Heo wrote:
> On Mon, Jan 26, 2015 at 04:21:39PM -0800, Jason Low wrote:
> > Compiling kernel/ causes warnings:
> >
> > ... ‘root’ may be used uninitialized in this function
> > ... ‘root’ was declared here
> >
>
possibility reader(s) may have the lock.
> - * To be safe, avoid spinning in these situations.
> - */
> - return on_cpu;
> + ret = owner->on_cpu;
> +done:
> + rcu_read_unlock();
> + return ret;
> }
Acked-by: Jason Low
) in the cpuset traversal.
Signed-off-by: Jason Low
---
kernel/cpuset.c | 12 +++++++-----
1 files changed, 7 insertions(+), 5 deletions(-)
diff --git a/kernel/cpuset.c b/kernel/cpuset.c
index 64b257f..0f58c54 100644
--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -541,15 +541,17 @@ update_dom
On Tue, 2015-01-27 at 19:54 -0800, Davidlohr Bueso wrote:
> On Tue, 2015-01-27 at 09:23 -0800, Jason Low wrote:
> > On Sun, 2015-01-25 at 23:36 -0800, Davidlohr Bueso wrote:
> > > When readers hold the semaphore, the ->owner is nil. As such,
> > > and unlike mutexes,
On Mon, 2015-03-02 at 13:49 -0800, Jason Low wrote:
> On Mon, 2015-03-02 at 11:03 -0800, Linus Torvalds wrote:
> > On Mon, Mar 2, 2015 at 10:42 AM, Jason Low wrote:
> > >
> > > This patch converts the timers to 64 bit atomic variables and use
> > > atomic add
On Thu, 2015-03-19 at 10:59 -0700, Linus Torvalds wrote:
> On Thu, Mar 19, 2015 at 10:21 AM, Jason Low wrote:
> >
> > I tested this patch on a 32 bit ARM system with 4 cores. Using the
> > generic 64 bit atomics, I did not see any performance change with this
> > patch,
On Wed, 2015-04-01 at 14:03 +0100, Morten Rasmussen wrote:
Hi Morten,
> > Alright I see. But it is one additional wake up. And the wake up will be
> > within the cluster. We will not wake up any CPU in the neighboring
> > cluster unless there are tasks to be pulled. So, we can wake up a core
> >
On Tue, 2015-03-31 at 14:07 +0530, Preeti U Murthy wrote:
> On 03/31/2015 12:25 AM, Jason Low wrote:
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index fdae26e..ba8ec1a 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> >
On Wed, 2015-04-01 at 18:04 +0100, Morten Rasmussen wrote:
> On Wed, Apr 01, 2015 at 07:49:56AM +0100, Preeti U Murthy wrote:
> >
> > On 04/01/2015 12:24 AM, Jason Low wrote:
> > > On Tue, 2015-03-31 at 14:07 +0530, Preeti U Murthy wrote:
> > >> Hi Jason,
On Wed, 2015-04-01 at 18:04 +0100, Morten Rasmussen wrote:
> On Wed, Apr 01, 2015 at 07:49:56AM +0100, Preeti U Murthy wrote:
> > I am sorry I don't quite get this. Can you please elaborate?
>
> I think the scenario is that we are in nohz_idle_balance() and decide to
> bail out because we have pu
On Thu, 2015-04-02 at 10:17 +0100, Morten Rasmussen wrote:
> On Thu, Apr 02, 2015 at 06:59:07AM +0100, Jason Low wrote:
> > Also, below is an example patch.
> >
> > (Without the conversion to idle_cpu(), the check for rq->idle_balance
> > would not be accurate a
On Thu, 2015-04-23 at 14:24 -0400, Waiman Long wrote:
> The table below shows the % improvement in throughput (1100-2000 users)
> in the various AIM7's workloads:
>
> Workload                % increase in throughput
Missing table here? :)
> ---
> include/linux/osq_lock.h | 5 +++
> kernel/
This patchset improves the scalability of itimers, thread_group_cputimer
and addresses a performance issue we found while running a database
workload where more than 30% of total time is spent in the kernel
trying to acquire the thread_group_cputimer spinlock.
While we're modifying sched and timer
neric atomics and did not find the overhead to be much of an issue.
An explanation for why this isn't an issue is that 32 bit systems usually
have small numbers of CPUs, and cacheline contention from extra spinlocks
called periodically is not really apparent on smaller systems.
Signed-off-by:
3, enables it
after thread 1 checks !cputimer->running in thread_group_cputimer(), then
there is a possibility that update_gt_cputime() is updating the cputimers
while the cputimer is running.
This patch uses cmpxchg and retry logic to ensure that update_gt_cputime()
is making its updates atomically
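The retry pattern itself looks roughly like this userspace sketch (not the kernel's update_gt_cputime(); the function name here is made up):

#include <stdatomic.h>

/*
 * Atomically raise *cur to 'val' only if 'val' is larger, retrying the
 * compare-and-exchange until the update lands or becomes unnecessary.
 */
static void update_gt(_Atomic unsigned long long *cur, unsigned long long val)
{
        unsigned long long old = atomic_load(cur);

        while (old < val) {
                /* On failure, 'old' is reloaded with the current value. */
                if (atomic_compare_exchange_weak(cur, &old, val))
                        break;
        }
}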
ACCESS_ONCE doesn't work reliably on non-scalar types. This patch removes
the rest of the existing usages of ACCESS_ONCE in the scheduler, and uses
the new READ_ONCE and WRITE_ONCE APIs.
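As a small illustration of the conversion pattern (the field below is only an example of the kind of access being converted, not a line quoted from the patch):

/* Before: ACCESS_ONCE() is a cast through volatile and is only reliable
 * for scalar types. */
seq = ACCESS_ONCE(p->mm->numa_scan_seq);
ACCESS_ONCE(p->mm->numa_scan_seq) = seq + 1;

/* After: READ_ONCE()/WRITE_ONCE() also cope with non-scalar types. */
seq = READ_ONCE(p->mm->numa_scan_seq);
WRITE_ONCE(p->mm->numa_scan_seq, seq + 1);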
Signed-off-by: Jason Low
---
include/linux/sched.h |4 ++--
kernel/f
Hi Steven,
On Tue, 2015-04-14 at 19:59 -0400, Steven Rostedt wrote:
> On Tue, 14 Apr 2015 16:09:44 -0700
> Jason Low wrote:
>
>
> > @@ -2088,7 +2088,7 @@ void task_numa_fault(int last_cpupid, int mem_node,
> > int pages, int flags)
> >
> > static void r
On Wed, 2015-04-15 at 09:35 +0200, Ingo Molnar wrote:
> * Ingo Molnar wrote:
>
> > So after your changes we still have a separate:
> >
> > struct task_cputime {
> > cputime_t utime;
> > cputime_t stime;
> > unsigned long long sum_exec_runtime;
> > };
> >
> > Which then w
On Wed, 2015-04-15 at 09:46 +0200, Ingo Molnar wrote:
> * Steven Rostedt wrote:
> > You are correct. Now I'm thinking that the WRITE_ONCE() is not needed,
> > and just a:
> >
> > p->mm->numa_scan_seq = READ_ONCE(p->numa_scan_seq) + 1;
> >
> > Can be done. But I'm still trying to wrap my hea
On Wed, 2015-04-15 at 16:07 +0530, Preeti U Murthy wrote:
> On 04/15/2015 04:39 AM, Jason Low wrote:
> > /*
> > @@ -885,11 +890,8 @@ static void check_thread_timers(struct task_struct
> > *tsk,
> > static void stop_process_timers(struct signal_struct
On Wed, 2015-04-15 at 15:32 +0200, Peter Zijlstra wrote:
> On Wed, Apr 15, 2015 at 03:25:36PM +0200, Frederic Weisbecker wrote:
> > On Tue, Apr 14, 2015 at 04:09:45PM -0700, Jason Low wrote:
> > > void thread_group_cputimer(struct task_struct *tsk, struct task_cpu
On Wed, 2015-04-15 at 07:23 -0700, Davidlohr Bueso wrote:
> On Tue, 2015-04-14 at 16:09 -0700, Jason Low wrote:
> > While running a database workload, we found a scalability issue with
> > itimers.
> >
> > Much of the problem was caused by the thread_group_cputimer
On Tue, 2015-04-14 at 22:40 -0400, Steven Rostedt wrote:
> You are correct. Now I'm thinking that the WRITE_ONCE() is not needed,
> and just a:
>
> p->mm->numa_scan_seq = READ_ONCE(p->numa_scan_seq) + 1;
Just to confirm, is this a typo? Because there really is a numa_scan_seq
in the task_st
Hi Ingo,
On Wed, 2015-04-15 at 09:46 +0200, Ingo Molnar wrote:
> * Steven Rostedt wrote:
> > You are correct. Now I'm thinking that the WRITE_ONCE() is not needed,
> > and just a:
> >
> > p->mm->numa_scan_seq = READ_ONCE(p->numa_scan_seq) + 1;
> >
> > Can be done. But I'm still trying to wr
On Thu, 2015-04-16 at 20:15 +0200, Peter Zijlstra wrote:
> On Thu, Apr 16, 2015 at 08:02:27PM +0200, Ingo Molnar wrote:
> > > ACCESS_ONCE() is not a compiler barrier
> >
> > It's not a general compiler barrier (and I didn't claim so) but it is
> > still a compiler barrier: it's documented as a we
On Thu, 2015-04-16 at 20:24 +0200, Ingo Molnar wrote:
> Would it make sense to add a few comments to the seq field definition
> site(s), about how it's supposed to be accessed - or to the
> READ_ONCE()/WRITE_ONCE() sites, to keep people from wondering?
How about this:
---
diff --git a/kernel/sc