Re: [PATCH] sched: wakeup buddy

2013-03-17 Thread Michael Wang
benchmarks are still inert to the changes. I'm planning to make a new patch for this approach later, in which time_limit is a knob with a default value of 1ms (usually the initial value of balance_interval and the value of min_interval); it will be based on the latest tip tree. Regards, Michael Wang

Re: [RFC 2/2] sched/fair: prefer a CPU in the "lowest" idle state

2013-02-03 Thread Michael Wang
On 02/03/2013 01:50 AM, Sebastian Andrzej Siewior wrote: > On 01/31/2013 03:12 AM, Michael Wang wrote: >> I'm not sure, but just concern about this case: >> >> group 0 cpu 0 cpu 1 >> least idle 4 task

Re: [patch v3 0/8] sched: use runnable avg in load balance

2013-04-02 Thread Michael Wang
1% | 15 GB | 32 | 35988 | | 34025 | The reason may be caused by wake_affine()'s higher overhead, and pgbench is really sensitive to this stuff... Regards, Michael Wang > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a m

Re: [patch v3 0/8] sched: use runnable avg in load balance

2013-04-02 Thread Michael Wang
rring whenever wake_affine() and pgbench appear in > the same sentence;) I saw the patch touched wake_affine(), and was just interested in what will happen ;-) The patch changed the overhead of wake_affine(), and also influences its result; I used to think the latter may do some help to the

Re: [patch v3 0/8] sched: use runnable avg in load balance

2013-04-02 Thread Michael Wang
On 04/02/2013 04:35 PM, Alex Shi wrote: > On 04/02/2013 03:23 PM, Michael Wang wrote: [snip] >> >> The reason may caused by wake_affine()'s higher overhead, and pgbench is >> really sensitive to this stuff... > > Thanks for testing. Could you like to remove the l

Re: [patch v3 0/8] sched: use runnable avg in load balance

2013-04-02 Thread Michael Wang
| 32 | 35988 | | 45749 | +27.12% Very nice improvement. I'd like to test it with the wake-affine throttle patch later; let's see what will happen ;-) Any idea on why the last one caused the regression? Regards, Michael Wang

Re: [patch v3 0/8] sched: use runnable avg in load balance

2013-04-02 Thread Michael Wang
On 04/03/2013 10:56 AM, Alex Shi wrote: > On 04/03/2013 10:46 AM, Michael Wang wrote: >> | 15 GB | 16 | 45110 | | 48091 | >> | 15 GB | 24 | 41415 | | 47415 | >> | 15 GB | 32 | 35988 | | 45749 |+27.12% >> >> Very nice improvement,

Re: [patch v3 0/8] sched: use runnable avg in load balance

2013-04-02 Thread Michael Wang
prefer higher load if burst */ load = burst_prev ? target_load(prev_cpu, idx) : source_load(prev_cpu, idx); this_load = target_load(this_cpu, idx); Regards, Michael Wang > + } > > /* >* If sync wakeup then subtract the (maximum possible)

Re: [patch v3 0/8] sched: use runnable avg in load balance

2013-04-02 Thread Michael Wang
On 04/03/2013 01:38 PM, Michael Wang wrote: > On 04/03/2013 12:28 PM, Alex Shi wrote: > [snip] >> >> but the patch may cause some unfairness if this/prev cpu are not burst at >> same time. So could like try the following patch? > > I will try it later,

Re: [patch v3 0/8] sched: use runnable avg in load balance

2013-04-02 Thread Michael Wang
On 04/03/2013 12:28 PM, Alex Shi wrote: > On 04/03/2013 11:23 AM, Michael Wang wrote: >> On 04/03/2013 10:56 AM, Alex Shi wrote: >>> On 04/03/2013 10:46 AM, Michael Wang wrote: [snip] > > > From 4722a7567dccfb19aa5afbb49982ffb6d65e6ae5 Mon Sep 17 00:00:00 2001 >

Re: [patch v3 0/8] sched: use runnable avg in load balance

2013-04-03 Thread Michael Wang
On 04/03/2013 02:53 PM, Alex Shi wrote: > On 04/03/2013 02:22 PM, Michael Wang wrote: >>>> >>>> If many tasks sleep long time, their runnable load are zero. And if they >>>> are waked up bursty, too light runnable load causes big imbalance among >>

Re: [patch v3 0/8] sched: use runnable avg in load balance

2013-04-03 Thread Michael Wang
On 04/03/2013 04:46 PM, Alex Shi wrote: > On 04/02/2013 03:23 PM, Michael Wang wrote: >> | 15 GB | 12 | 45393 | | 43986 | >> | 15 GB | 16 | 45110 | | 45719 | >> | 15 GB | 24 | 41415 | | 36813 |-11.11% >> | 15 GB | 32 | 35988 | | 3

[RFC PATCH v3 0/3] sched: simplify the select_task_rq_fair()

2013-01-29 Thread Michael Wang
e. And that comes out as v3: no load balance for WAKE. Test with: 12 cpu X86 server and linux-next 3.8.0-rc3. Michael Wang (3): [RFC PATCH v3 1/3] sched: schedule balance map foundation [RFC PATCH v3 2/3] sched: build schedule balance map [RFC

[RFC PATCH v3 1/3] sched: schedule balance map foundation

2013-01-29 Thread Michael Wang
cpu which supports wake up on level l. This patch contains the foundation of the schedule balance map, in order to serve the follow-up patches. Signed-off-by: Michael Wang --- kernel/sched/core.c | 44 kernel/sched/sched.h | 14 ++ 2 files cha

[RFC PATCH v3 2/3] sched: build schedule balance map

2013-01-29 Thread Michael Wang
the lower sd. Signed-off-by: Michael Wang --- kernel/sched/core.c | 67 +++ 1 files changed, 67 insertions(+), 0 deletions(-) diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 092c801..c2a13bc 100644 --- a/kernel/sched/core.c +++ b/kernel/

[RFC PATCH v3 3/3] sched: simplify select_task_rq_fair() with schedule balance map

2013-01-29 Thread Michael Wang
Signed-off-by: Michael Wang --- kernel/sched/fair.c | 135 --- 1 files changed, 74 insertions(+), 61 deletions(-) diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 5eea870..0935c7d 100644 --- a/kernel/sched/fair.c +++ b/kernel/sc

Re: [RFC 2/2] sched/fair: prefer a CPU in the "lowest" idle state

2013-01-30 Thread Michael Wang
> + return idle_group; Hi, Sebastian. I'm not sure, but am just concerned about this case: group 0: cpu 0 cpu 1 (least idle / 4 task); group 1: cpu 2 cpu 3 (1 task)

Re: [RFC 2/2] sched/fair: prefer a CPU in the "lowest" idle state

2013-01-30 Thread Michael Wang
On 01/31/2013 01:16 PM, Namhyung Kim wrote: > Hi Sebastian and Michael, > > On Thu, 31 Jan 2013 10:12:35 +0800, Michael Wang wrote: >> On 01/31/2013 05:19 AM, Sebastian Andrzej Siewior wrote: >>> If a new CPU has to be choosen for a task, then the scheduler first select

Re: [RFC 2/2] sched/fair: prefer a CPU in the "lowest" idle state

2013-01-30 Thread Michael Wang
On 01/31/2013 02:58 PM, Namhyung Kim wrote: > On Thu, 31 Jan 2013 14:39:20 +0800, Michael Wang wrote: >> On 01/31/2013 01:16 PM, Namhyung Kim wrote: >>> Anyway, I have an idea with this in mind. It's like adding a new "idle >>> load" to each idle cpu rat

Re: [RFC 2/2] sched/fair: prefer a CPU in the "lowest" idle state

2013-01-31 Thread Michael Wang
On 01/31/2013 03:40 PM, Namhyung Kim wrote: > On Thu, 31 Jan 2013 15:30:02 +0800, Michael Wang wrote: >> On 01/31/2013 02:58 PM, Namhyung Kim wrote: >>> But AFAIK the number of states in cpuidle is usually less than 10 so maybe >>> we can change the weight then, but ther

Re: [RFC 2/2] sched/fair: prefer a CPU in the "lowest" idle state

2013-01-31 Thread Michael Wang
On 01/31/2013 04:24 PM, Michael Wang wrote: > On 01/31/2013 03:40 PM, Namhyung Kim wrote: >> On Thu, 31 Jan 2013 15:30:02 +0800, Michael Wang wrote: >>> On 01/31/2013 02:58 PM, Namhyung Kim wrote: >>>> But AFAIK the number of states in cpuidle is usually less than

Re: [RFC 2/2] sched/fair: prefer a CPU in the "lowest" idle state

2013-01-31 Thread Michael Wang
On 01/31/2013 04:45 PM, Michael Wang wrote: > On 01/31/2013 04:24 PM, Michael Wang wrote: >> On 01/31/2013 03:40 PM, Namhyung Kim wrote: >>> On Thu, 31 Jan 2013 15:30:02 +0800, Michael Wang wrote: >>>> On 01/31/2013 02:58 PM, Namhyung Kim wrote: >>>>>

Re: [RFC PATCH 0/2] sched: simplify the select_task_rq_fair()

2013-01-23 Thread Michael Wang
On 01/24/2013 02:01 PM, Michael Wang wrote: > On 01/23/2013 05:32 PM, Mike Galbraith wrote: > [snip] >> --- >> include/linux/topology.h |6 ++--- >> kernel/sched/core.c | 41 ++--- >> k

Re: [RFC PATCH 0/2] sched: simplify the select_task_rq_fair()

2013-01-23 Thread Michael Wang
On 01/24/2013 02:51 PM, Mike Galbraith wrote: > On Thu, 2013-01-24 at 14:01 +0800, Michael Wang wrote: > >> I've enabled WAKE flag on my box like you did, but still can't see >> regression, and I've just tested on a power server with 64 cpu, also >> fail

Re: [RFC PATCH 0/2] sched: simplify the select_task_rq_fair()

2013-01-24 Thread Michael Wang
On 01/24/2013 03:47 PM, Mike Galbraith wrote: > On Thu, 2013-01-24 at 15:15 +0800, Michael Wang wrote: >> On 01/24/2013 02:51 PM, Mike Galbraith wrote: >>> On Thu, 2013-01-24 at 14:01 +0800, Michael Wang wrote: >>> >>>> I've enabled WAKE flag

Re: [RFC PATCH 0/2] sched: simplify the select_task_rq_fair()

2013-01-24 Thread Michael Wang
On 01/24/2013 05:07 PM, Mike Galbraith wrote: > On Thu, 2013-01-24 at 16:14 +0800, Michael Wang wrote: > >> Now it's time to work on v3 I think, let's see what we could get this time. > > Maybe v3 can try to not waste so much ram on affine map? Yeah, that has been a

Re: [RFC PATCH 0/2] sched: simplify the select_task_rq_fair()

2013-01-24 Thread Michael Wang
On 01/24/2013 06:34 PM, Mike Galbraith wrote: > On Thu, 2013-01-24 at 17:26 +0800, Michael Wang wrote: >> On 01/24/2013 05:07 PM, Mike Galbraith wrote: >>> On Thu, 2013-01-24 at 16:14 +0800, Michael Wang wrote: >>> >>>> Now it's time to work on v3 I

[RFC PATCH] sched: wakeup buddy

2013-02-27 Thread Michael Wang
55343 | +31.19% | 15 GB | 32 | 35983 | | 55358 | +53.84% Signed-off-by: Michael Wang --- include/linux/sched.h |8 kernel/sched/fair.c | 97 - kernel/sysctl.c | 10 + 3 files changed, 113

Re: [RFC PATCH] sched: wakeup buddy

2013-02-27 Thread Michael Wang
Hi, Mike Thanks for your reply. On 02/28/2013 03:18 PM, Mike Galbraith wrote: > On Thu, 2013-02-28 at 14:38 +0800, Michael Wang wrote: > >> +/* >> + * current is the only task on rq and it is >> +

Re: [RFC PATCH] sched: wakeup buddy

2013-02-27 Thread Michael Wang
On 02/28/2013 03:40 PM, Michael Wang wrote: > Hi, Mike > > Thanks for your reply. > > On 02/28/2013 03:18 PM, Mike Galbraith wrote: >> On Thu, 2013-02-28 at 14:38 +0800, Michael Wang wrote: >> >>> + /* >>> +

Re: [RFC PATCH] sched: wakeup buddy

2013-02-28 Thread Michael Wang
On 02/28/2013 04:04 PM, Mike Galbraith wrote: > On Thu, 2013-02-28 at 15:40 +0800, Michael Wang wrote: >> Hi, Mike >> >> Thanks for your reply. >> >> On 02/28/2013 03:18 PM, Mike Galbraith wrote: >>> On Thu, 201

Re: [RFC PATCH] sched: wakeup buddy

2013-02-28 Thread Michael Wang
On 02/28/2013 04:24 PM, Mike Galbraith wrote: > On Thu, 2013-02-28 at 16:14 +0800, Michael Wang wrote: >> On 02/28/2013 04:04 PM, Mike Galbraith wrote: > >>> It would be nice if it _were_ a promise, but it is not, it's a hint. >> >> Bad to know :( >>

Re: sched: circular dependency between sched_domains_mutex and oom_notify_list

2013-02-18 Thread Michael Wang
g case. But I'm not sure why the log shows "sched_domains_mutex" as a target, so is your system really deadlocked, or is it just a fake report? Regards, Michael Wang > > [ 1039.634183] == > [ 1039.635717] [ INFO: possible circ

Re: [RFC] sched: The removal of idle_balance()

2013-02-18 Thread Michael Wang
've done something wrong while distributing tasks among the > CPUs, that indicates a problem during fork/exec/wake balancing? Hmm... I think, unless we have the promise that all those threads have the same behaviour at any moment, then even if each cpu has the same load, there are still

Re: sched: circular dependency between sched_domains_mutex and oom_notify_list

2013-02-19 Thread Michael Wang
On 02/19/2013 12:48 PM, Michael Wang wrote: > On 02/17/2013 01:42 PM, Sasha Levin wrote: >> Hi all, >> >> I was fuzzing with trinity inside a KVM tools guest, with today's -next >> kernel >> when I've hit the following spew. >> >> I suspec

Re: [RFC PATCH v3 0/3] sched: simplify the select_task_rq_fair()

2013-02-20 Thread Michael Wang
ation to be able to enact plan-B. The benefit comes from avoiding unnecessary work, and the patch set is supposed to only reduce the cost of the key function with the least logical change. I can't promise it benefits all workloads, but so far I've not found a regression. Regards, Michael Wang

Re: [RFC PATCH v3 1/3] sched: schedule balance map foundation

2013-02-20 Thread Michael Wang
On 02/20/2013 09:21 PM, Peter Zijlstra wrote: > On Tue, 2013-01-29 at 17:09 +0800, Michael Wang wrote: >> + for_each_possible_cpu(cpu) { >> + sbm = &per_cpu(sbm_array, cpu); >> + node = cpu_to_node(cpu); >> + si

Re: [RFC PATCH v3 1/3] sched: schedule balance map foundation

2013-02-20 Thread Michael Wang
On 02/20/2013 09:25 PM, Peter Zijlstra wrote: > On Tue, 2013-01-29 at 17:09 +0800, Michael Wang wrote: >> +struct sched_balance_map { >> + struct sched_domain **sd[SBM_MAX_TYPE]; >> + int top_level[SBM_MAX_TYPE]; >> + struct sched_domain *affine_map

Re: [RFC PATCH v3 0/3] sched: simplify the select_task_rq_fair()

2013-02-20 Thread Michael Wang
not explain this point in the cover, but it's really not a big deal in my opinion... And I'm going to apply Mike's suggestion, doing allocation when a cpu goes active; that will save some space :) Regards, Michael Wang > >> any ideas exactly *why* it speeds up? > > That is

Re: [RFC PATCH v3 0/3] sched: simplify the select_task_rq_fair()

2013-02-20 Thread Michael Wang
ncing discussion we already noted that the >> find_idlest_goo() is in need of attention. > > Yup, even little stuff like break off the search when load is zero.. Agree, searching in a bunch of idle cpus and their subsets doesn't make sense... Regards, Michael Wang > unles

Re: [RFC PATCH v3 0/3] sched: simplify the select_task_rq_fair()

2013-02-20 Thread Michael Wang
On 02/21/2013 02:11 PM, Mike Galbraith wrote: > On Thu, 2013-02-21 at 12:51 +0800, Michael Wang wrote: >> On 02/20/2013 06:49 PM, Ingo Molnar wrote: >> [snip] [snip] >> >> if wake_affine() >> new_cpu = select_idle_sibling(curr_cpu) >

Re: [RFC PATCH v3 0/3] sched: simplify the select_task_rq_fair()

2013-02-21 Thread Michael Wang
On 02/21/2013 04:10 PM, Mike Galbraith wrote: > On Thu, 2013-02-21 at 15:00 +0800, Michael Wang wrote: >> On 02/21/2013 02:11 PM, Mike Galbraith wrote: >>> On Thu, 2013-02-21 at 12:51 +0800, Michael Wang wrote: >>>> On 02/20/2013 06:49 PM, Ingo Molnar wrote: >>

Re: [RFC PATCH v3 0/3] sched: simplify the select_task_rq_fair()

2013-02-21 Thread Michael Wang
On 02/21/2013 04:10 PM, Mike Galbraith wrote: > On Thu, 2013-02-21 at 15:00 +0800, Michael Wang wrote: >> On 02/21/2013 02:11 PM, Mike Galbraith wrote: >>> On Thu, 2013-02-21 at 12:51 +0800, Michael Wang wrote: >>>> On 02/20/2013 06:49 PM, Ingo Molnar wrote: >>

Re: [RFC PATCH v3 0/3] sched: simplify the select_task_rq_fair()

2013-02-21 Thread Michael Wang
On 02/21/2013 05:43 PM, Mike Galbraith wrote: > On Thu, 2013-02-21 at 17:08 +0800, Michael Wang wrote: > >> But is this patch set really cause regression on your Q6600? It may >> sacrificed some thing, but I still think it will benefit far more, >> especially on huge sy

Re: [RFC PATCH v3 0/3] sched: simplify the select_task_rq_fair()

2013-02-21 Thread Michael Wang
On 02/21/2013 06:20 PM, Peter Zijlstra wrote: > On Thu, 2013-02-21 at 12:51 +0800, Michael Wang wrote: >> The old logical when locate affine_sd is: >> >> if prev_cpu != curr_cpu >> if wake_affine() >> prev_

Re: [RFC PATCH v3 1/3] sched: schedule balance map foundation

2013-02-21 Thread Michael Wang
On 02/21/2013 07:37 PM, Peter Zijlstra wrote: > On Thu, 2013-02-21 at 12:58 +0800, Michael Wang wrote: >> >> You are right, it cost space in order to accelerate the system, I've >> calculated the cost once before (I'm really not good at this, please >&g

Re: [PATCH] sched: Skip looking at skip if next or last is set

2013-02-21 Thread Michael Wang
g to look at the scheduler ;-) Actually I gave up this idea, since I missed the point that the code will be optimized by the compiler, and usually it becomes some logic we could not imagine. My patch is correct logically, but it may not benefit the scheduler a lot; I don't think there wil

Re: [RFC PATCH v3 1/3] sched: schedule balance map foundation

2013-02-21 Thread Michael Wang
On 02/22/2013 11:33 AM, Alex Shi wrote: > On 02/22/2013 10:53 AM, Michael Wang wrote: >>>> >>>>>> And the final cost is 3000 int and 103 pointer, and some padding, >>>>>> but won't bigger than 10M, not a big deal for a system with 1000 c

Re: [RFC PATCH v3 1/3] sched: schedule balance map foundation

2013-02-21 Thread Michael Wang
On 02/22/2013 12:46 PM, Alex Shi wrote: > On 02/22/2013 12:19 PM, Michael Wang wrote: >> >>>> Why not seek other way to change O(n^2) to O(n)? >>>> >>>> Access 2G memory is unbelievable performance cost. >> Not access 2G memory, but (2G / 16K) me

Re: [RFC PATCH v3 0/3] sched: simplify the select_task_rq_fair()

2013-02-21 Thread Michael Wang
On 02/22/2013 01:02 PM, Mike Galbraith wrote: > On Fri, 2013-02-22 at 10:36 +0800, Michael Wang wrote: >> On 02/21/2013 05:43 PM, Mike Galbraith wrote: >>> On Thu, 2013-02-21 at 17:08 +0800, Michael Wang wrote: >>> >>>> But is this patch set really

Re: [RFC PATCH v3 0/3] sched: simplify the select_task_rq_fair()

2013-02-21 Thread Michael Wang
On 02/22/2013 01:08 PM, Mike Galbraith wrote: > On Fri, 2013-02-22 at 10:37 +0800, Michael Wang wrote: > >> According to the testing result, I could not agree this purpose of >> wake_affine() benefit us, but I'm sure that wake_affine() is a terrible >> performan

Re: [RFC PATCH v3 0/3] sched: simplify the select_task_rq_fair()

2013-02-21 Thread Michael Wang
On 02/22/2013 01:02 PM, Mike Galbraith wrote: > On Fri, 2013-02-22 at 10:36 +0800, Michael Wang wrote: >> On 02/21/2013 05:43 PM, Mike Galbraith wrote: >>> On Thu, 2013-02-21 at 17:08 +0800, Michael Wang wrote: >>> >>>> But is this patch set really

Re: [RFC PATCH v3 0/3] sched: simplify the select_task_rq_fair()

2013-02-22 Thread Michael Wang
On 02/22/2013 04:17 PM, Mike Galbraith wrote: > On Fri, 2013-02-22 at 14:42 +0800, Michael Wang wrote: > >> So this is trying to take care the condition when curr_cpu(local) and >> prev_cpu(remote) are on different nodes, which in the old world, >> wake_affine() w

Re: [RFC PATCH v3 0/3] sched: simplify the select_task_rq_fair()

2013-02-22 Thread Michael Wang
On 02/22/2013 04:21 PM, Peter Zijlstra wrote: > On Fri, 2013-02-22 at 10:36 +0800, Michael Wang wrote: >> According to my understanding, in the old world, wake_affine() will >> only >> be used if curr_cpu and prev_cpu share cache, which means they are in >> one package

Re: [RFC PATCH v3 0/3] sched: simplify the select_task_rq_fair()

2013-02-22 Thread Michael Wang
On 02/22/2013 04:36 PM, Peter Zijlstra wrote: > On Fri, 2013-02-22 at 10:37 +0800, Michael Wang wrote: >> But that's really some benefit hardly to be estimate, especially when >> the workload is heavy, the cost of wake_affine() is very high to >> calculated se one by o

Re: [RFC PATCH v3 0/3] sched: simplify the select_task_rq_fair()

2013-02-22 Thread Michael Wang
On 02/22/2013 05:39 PM, Peter Zijlstra wrote: > On Fri, 2013-02-22 at 17:10 +0800, Michael Wang wrote: >> On 02/22/2013 04:21 PM, Peter Zijlstra wrote: >>> On Fri, 2013-02-22 at 10:36 +0800, Michael Wang wrote: >>>> According to my understanding, in the old world

Re: [RFC PATCH v3 0/3] sched: simplify the select_task_rq_fair()

2013-02-22 Thread Michael Wang
On 02/22/2013 05:57 PM, Peter Zijlstra wrote: > On Fri, 2013-02-22 at 17:11 +0800, Michael Wang wrote: > >> Ok, it do looks like wake_affine() lost it's value... > > I'm not sure we can say that on this one benchmark, there's a > preemption advantage to runn

Re: [PATCH] sched: wakeup buddy

2013-03-12 Thread Michael Wang
, I think we still need a knob finally, since it doesn't sound like a general optimization which benefits all cases. And I don't agree to remove the stuff, since we have many theories that this could benefit us; but before it really shows the benefit in all cases, provide a way to keep i

Re: [PATCH] sched: wakeup buddy

2013-03-14 Thread Michael Wang
On 03/14/2013 06:58 PM, Peter Zijlstra wrote: > On Wed, 2013-03-13 at 11:07 +0800, Michael Wang wrote: > >> However, we already figure out the logical that wakeup related task >> could benefit from closely running, this could promise us somewhat >> reliable benefit. >

Re: [PATCH] sched: unify the check on atomic sleeping in __might_sleep() and schedule_bug()

2012-09-02 Thread Michael Wang
On 08/22/2012 10:40 AM, Michael Wang wrote: > From: Michael Wang > > Fengguang Wu has reported the bug: > > [0.043953] BUG: scheduling while atomic: swapper/0/1/0x1002 > [0.044017] no locks held by swapper/0/1. > [0.044692] Pid: 1, comm: swapper/0 Not ta

[RFC PATCH 0/4] linsched: fix issues to make the test results more accurately

2012-09-02 Thread Michael Wang
From: Michael Wang This patch set fixes several issues in linsched, helping it behave closer to the real world, in order to make the test results more accurate. Signed-off-by: Michael Wang --- b/arch/linsched/kernel/irq.c |4 b/tools/linsched/hrtimer.c |1 - b/tools

[RFC PATCH 1/4] linsched: remove process_all_softirqs() in main loop for accuracy

2012-09-02 Thread Michael Wang
From: Michael Wang process_all_softirqs() handles softirqs for all the cpus even when it's not the right timing for them; this causes inaccuracy. This patch stops invoking process_all_softirqs(), so softirqs will only be handled after a timer interrupt arrives. Signed-off-by: Michael

[RFC PATCH 3/4] linsched: avoid invoke tick_nohz_idle_enter() multiple times in idle

2012-09-02 Thread Michael Wang
From: Michael Wang In the real world, tick_nohz_idle_enter() is invoked by the idle thread when the cpu changes from active to idle, and will only be invoked again after tick_nohz_idle_exit() was invoked by the idle thread when the cpu is going to recover; invoking it multiple times in one idle period may cause

[RFC PATCH 2/4] linsched: add check on invoke tick_nohz_irq_exit() in irq_exit()

2012-09-02 Thread Michael Wang
From: Michael Wang tick_nohz_irq_exit() makes sure the tick timer is reprogrammed correctly after the cpu enters idle. Without this check, after the interrupt the tick timer will be enabled even while the cpu is still idle; this causes inaccuracy. Signed-off-by: Michael Wang --- arch/linsched/kernel

[RFC PATCH 4/4] linsched: add the simulation of schedule after ipi interrupt

2012-09-02 Thread Michael Wang
From: Michael Wang In the real world of x86, during an interrupt, if the current thread needs to be rescheduled, we do it after invoking do_IRQ. And in linsched, while handling the softirq, it may cause a reschedule ipi on another cpu, so we need to do the schedule for them at that time, otherwise we will

Re: rcu_bh stalls on 3.2.28

2012-09-02 Thread Michael Wang
t before we start checking stall for this gp, but the INFO shows that we have a current jiffies which is bigger than rsp->jiffies_stall but equal to rsp->gp_start, really strange... Could you please have a try on the latest kernel and confirm whether this issue still exists? BTW: Is th

[PATCH] slab: fix the DEADLOCK issue on l3 alien lock

2012-09-04 Thread Michael Wang
From: Michael Wang A DEADLOCK will be reported while running a kernel with NUMA and LOCKDEP enabled; the process of this fake report is: kmem_cache_free()//free obj in cachep -> cache_free_alien() //acquire cachep's l3 alien lock -> __drain_

Re: WARNING: cpu_is_offline() at native_smp_send_reschedule()

2012-09-04 Thread Michael Wang
current cpu should do the kick, and the first condition we need to match is that the current cpu should be idle, but the trace shows the current pid is 88, not 0. We should add Peter to the cc list; maybe he will be interested in what happened. Regards, Michael Wang > [ 10.987506] [<7905fdad>

Re: [PATCH] slab: fix the DEADLOCK issue on l3 alien lock

2012-09-10 Thread Michael Wang
On 09/08/2012 04:39 PM, Pekka Enberg wrote: > On Fri, Sep 7, 2012 at 1:29 AM, Paul E. McKenney > wrote: >> On Thu, Sep 06, 2012 at 11:05:11AM +0800, Michael Wang wrote: >>> On 09/05/2012 09:55 PM, Christoph Lameter wrote: >>>> On Wed, 5 Sep 2012, Michael Wang w

Re: [PATCH 1/3] raid: replace list_for_each_continue_rcu with new interface

2012-09-10 Thread Michael Wang
On 09/11/2012 02:21 PM, NeilBrown wrote: > On Mon, 10 Sep 2012 16:30:11 +0800 Michael Wang > wrote: > >> On 08/24/2012 08:51 AM, Michael Wang wrote: >>> On 08/17/2012 12:33 PM, Michael Wang wrote: >>>> From: Michael Wang >>>> >>&g

Re: WARNING: at kernel/rcutree.c:1558 rcu_do_batch+0x386/0x3a0(), during CPU hotplug

2012-09-12 Thread Michael Wang
Signed-off-by: Paul E. McKenney Signed-off-by: Paul E. McKenney diff --git a/kernel/rcutree.c b/kernel/rcutree.c index 300aba6..84a6f55 100644 --- a/kernel/rcutree.c +++ b/kernel/rcutree.c @@ -1892,6 +1892,8 @@ static void rcu_process_callbacks(struct softirq_action *unused)

Re: [PATCH] sched: unify the check on atomic sleeping in __might_sleep() and schedule_bug()

2012-09-13 Thread Michael Wang
On 09/03/2012 10:16 AM, Michael Wang wrote: > On 08/22/2012 10:40 AM, Michael Wang wrote: >> From: Michael Wang >> >> Fengguang Wu has reported the bug: >> >> [0.043953] BUG: scheduling while atomic: swapper/0/1/0x1002 >> [0.044017] no locks hel

Re: [PATCH] sched: unify the check on atomic sleeping in __might_sleep() and schedule_bug()

2012-09-13 Thread Michael Wang
On 09/13/2012 06:04 PM, Peter Zijlstra wrote: > On Wed, 2012-08-22 at 10:40 +0800, Michael Wang wrote: >> From: Michael Wang >> >> Fengguang Wu has reported the bug: >> >> [0.043953] BUG: scheduling while atomic: swapper/0/1/0x1002 >> [

Re: WARNING: at kernel/rcutree.c:1558 rcu_do_batch+0x386/0x3a0(), during CPU hotplug

2012-09-13 Thread Michael Wang
On 09/13/2012 08:47 PM, Srivatsa S. Bhat wrote: > On 09/13/2012 12:00 PM, Michael Wang wrote: >> On 09/12/2012 11:31 PM, Paul E. McKenney wrote: >>> On Wed, Sep 12, 2012 at 06:06:20PM +0530, Srivatsa S. Bhat wrote: >>>> On 07/19/2012 10:45 PM, Paul E. McKenney wrote:

Re: [RFC][PATCH] sched: Fix a deadlock of cpu-hotplug

2012-10-24 Thread Michael Wang
k at how __stop_machine() calls the function with IRQs disabled for ! > stop_machine_initialized or !SMP. Also stop_machine_cpu_stop() seems to > disabled interrupts, so how do we end up calling take_cpu_down() with > IRQs enabled? The patch is no doubt wrong... The discussion is in: https://lkml

The idea about scheduler test module(STM)

2012-10-24 Thread Michael Wang
t by module param. I would appreciate it if I could get some feedback from the scheduler experts like you; whatever you think, good or junk, please let me know :) Regards, Michael Wang play.sh:
DURATION=10
NORMAL_THREADS=24
PERIOD=10
make clean
make
insmod ./schedtm.ko normalnr=$NORMAL_TH

Re: The idea about scheduler test module(STM)

2012-10-25 Thread Michael Wang
way could give. > Maybe you should show us ur better examples. :) I'd like to make it more useful not just a demo, but I need more feedback and suggestions :) Regards, Michael Wang > > Regards, > Charles > > On 10/25/2012 01:40 PM, Michael Wang wrote: >> Hi, F

Re: [PATCH v3] epoll: Support for disabling items, and a self-test app.

2012-10-31 Thread Michael Wang
On 11/01/2012 02:57 AM, Paton J. Lewis wrote: > On 10/30/12 11:32 PM, Michael Wang wrote: >> On 10/26/2012 08:08 AM, Paton J. Lewis wrote: >>> From: "Paton J. Lewis" >>> >>> It is not currently possible to reliably delete epoll items when >>

Re: [PATCH] slab: annotate on-slab caches nodelist locks

2012-11-01 Thread Michael Wang
allow me to ask a few questions: 1. what scenario will cause the fake deadlock? 2. what are the conflicting caches? 3. how are their lock operations nested? And I think it will be better if we have the bug log in the patch comment, so folks will easily know the reason we need this pa

Re: [PATCH] slab: annotate on-slab caches nodelist locks

2012-11-01 Thread Michael Wang
On 11/02/2012 12:48 AM, Glauber Costa wrote: > On 11/01/2012 11:11 AM, Michael Wang wrote: >> On 10/29/2012 06:49 PM, Glauber Costa wrote: >>> We currently provide lockdep annotation for kmalloc caches, and also >>> caches that have SLAB_DEBUG_OBJECTS enabled. The rea

Re: [PATCH v3] epoll: Support for disabling items, and a self-test app.

2012-11-01 Thread Michael Wang
On 11/02/2012 02:47 AM, Paton J. Lewis wrote: > On 10/31/12 5:43 PM, Michael Wang wrote: >> On 11/01/2012 02:57 AM, Paton J. Lewis wrote: >>> On 10/30/12 11:32 PM, Michael Wang wrote: >>>> On 10/26/2012 08:08 AM, Paton J. Lewis wrote: >>>>> From: &quo

Re: [PATCH v2] epoll: Support for disabling items, and a self-test app.

2012-11-04 Thread Michael Wang
delete thread:
1. set fd stop flag (user flag) // tell worker: don't use fd any more
2. epoll_ctl(DISABLE)
3. if it returns BUSY, try later // rcu_syn
4. else, do delete

worker thread:
1. invoke epoll_wait()

Re: rcu: INFO: rcu_preempt detected stalls on CPUs/tasks on v3.6

2012-07-25 Thread Michael Wang
On 07/26/2012 05:16 AM, Sasha Levin wrote: > On 07/25/2012 10:36 AM, Michael Wang wrote: >> On 07/25/2012 01:10 AM, Sasha Levin wrote: >>> Hi all, >>> >>> I was fuzzing with trinity inside a KVM tools guest, on the current 3.6, >>> and stumble

Re: [PATCH] sched: fix should_resched() to avoid do schedule in atomic

2012-09-25 Thread Michael Wang
On 09/18/2012 11:13 AM, Michael Wang wrote: > This patch try to fix the BUG: > > [0.043953] BUG: scheduling while atomic: swapper/0/1/0x1002 > [0.044017] no locks held by swapper/0/1. > [0.044692] Pid: 1, comm: swapper/0 Not tainted 3.6.0-rc1-00420-gb7aebb9 > #

Re: [PATCH] sched: rewrite the wrong annotation for select_task_rq_fair()

2012-09-25 Thread Michael Wang
On 09/18/2012 04:16 PM, Michael Wang wrote: > The annotation for select_task_rq_fair() is wrong since commit c88d5910, it's > actually for a removed function. > > This patch rewrite the wrong annotation to make it correct. Could I get some comments on this patch? Rega

Re: [PATCH] x86: remove the useless branch in c_start()

2012-09-25 Thread Michael Wang
On 09/19/2012 01:42 PM, Michael Wang wrote: > Since 'cpu == -1' in cpumask_next() is legal, no need to handle '*pos == 0' > specially. > > About the comments: > /* just in case, cpu 0 is not the first */ > A test with a cpumask in which cpu 0 is not t

Re: [PATCH v3] epoll: Support for disabling items, and a self-test app.

2012-10-30 Thread Michael Wang
nt is not 0, wait until the ref count is 0. So after DISABLE returns, we can safely delete anything related to that epi. One thing is that the user should not change the events info returned by epoll_wait(). It's just a proposal, but if it works, there will be no limit on ONESHOT any

Re: [PATCH] sched: smart wake-affine

2013-07-08 Thread Michael Wang
On 07/08/2013 04:21 PM, Peter Zijlstra wrote: > On Sun, Jul 07, 2013 at 08:43:25AM +0200, Mike Galbraith wrote: >> On Fri, 2013-07-05 at 14:16 +0800, Michael Wang wrote: >> >>> PeterZ has suggested some optimization which I sent out yesterday, I >>> suppose the

Re: [PATCH] sched: smart wake-affine

2013-07-08 Thread Michael Wang
On 07/08/2013 04:49 PM, Mike Galbraith wrote: > On Mon, 2013-07-08 at 10:21 +0200, Peter Zijlstra wrote: >> On Sun, Jul 07, 2013 at 08:43:25AM +0200, Mike Galbraith wrote: >>> On Fri, 2013-07-05 at 14:16 +0800, Michael Wang wrote: >>> >>>> PeterZ has sugge

Re: [v3.10 regression] deadlock on cpu hotplug

2013-07-08 Thread Michael Wang
lockdep_set_class(&j_cdbs->timer_mutex, &j_cdbs_key); + INIT_DEFERRABLE_WORK(&j_cdbs->work, dbs_data->cdata->gov_dbs_timer); } Regards, Michael Wang > > Best regards, > -- > Bartlomiej Zolnierkie

Re: [PATCH] sched: smart wake-affine

2013-07-08 Thread Michael Wang
not on the wrong way... HT here means hyperthreading, correct? I have some questions: 1. how do you disable hyperthreading? manually or some other way? 2. does the 3.10-rc5 in the image also have hyperthreading disabled? 3. does the v3 patch set show the same issue? Regards, Michael Wang > &g

Re: [PATCH] sched: smart wake-affine

2013-07-08 Thread Michael Wang
overhead, should make things better, especially when the workload is high and the platform is big (your box is really what I desired ;-), honestly). And if possible, a comparison based on the same baseline will be better :) Regards, Michael Wang > > Thanks, > Davidlohr

Re: [PATCH v3 1/2] sched: smart wake-affine foundation

2013-07-09 Thread Michael Wang
On 07/10/2013 09:52 AM, Sam Ben wrote: > On 07/08/2013 10:36 AM, Michael Wang wrote: >> Hi, Sam >> >> On 07/07/2013 09:31 AM, Sam Ben wrote: >>> On 07/04/2013 12:55 PM, Michael Wang wrote: >>>> wake-affine stuff is always trying to pull wakee close to w

Re: [v3.10 regression] deadlock on cpu hotplug

2013-07-09 Thread Michael Wang
is already > held by the first thread and deadlocks. Hmm... I think I get your point: some thread holds the lock and flushes some work which also tries to hold the same lock, correct? Ok, that's a problem; let's figure out a good way to solve it :) Regards, Michael Wang > > B

Re: [v3.10 regression] deadlock on cpu hotplug

2013-07-09 Thread Michael Wang
's > work item fires and queues the work item on CPU4 as well as CPU3. Thus, > gov_cancel_work() _effectively_ didn't do anything useful. That's interesting; it seems like we're a little closer to the root: the timer is supposed to stop but failed... I need some investigation here... Re

Re: [v3.10 regression] deadlock on cpu hotplug

2013-07-09 Thread Michael Wang
. But if the event is just to sync the queued work but not prevent further work from happening, then things will become tough... we need to confirm. What's your opinion? Regards, Michael Wang > > Also, you might perhaps want to try the (untested) patch shown below, and > see if it resolve

Re: [v3.10 regression] deadlock on cpu hotplug

2013-07-09 Thread Michael Wang
On 07/10/2013 01:39 PM, Viresh Kumar wrote: > On 10 July 2013 09:42, Michael Wang wrote: >> I'm not sure what is supposed after notify CPUFREQ_GOV_STOP event, if it >> is in order to stop queued work and prevent follow work happen again, >> then it failed to, and we need

Re: [v3.10 regression] deadlock on cpu hotplug

2013-07-10 Thread Michael Wang
On 07/10/2013 10:40 AM, Michael Wang wrote: > On 07/09/2013 07:51 PM, Bartlomiej Zolnierkiewicz wrote: > [snip] >> >> It doesn't help and unfortunately it just can't help as it only >> addresses lockdep functionality while the issue is not a lockdep >> pr

Re: [LOCKDEP] cpufreq: possible circular locking dependency detected

2013-07-10 Thread Michael Wang
) is supposed to never happen after the CPUFREQ_GOV_STOP notify; the whole policy should stop working at that time. But it failed to, and the work concurrent with the cpu dying caused the first problem. Thus I think we should focus on this and suggest the fix below; I'd like to know your opinions :)

Re: [LOCKDEP] cpufreq: possible circular locking dependency detected

2013-07-11 Thread Michael Wang
any point we missed. And we should also thank Srivatsa for catching the root issue ;-) Regards, Michael Wang > > > -ss > >> Regards, >> Michael Wang >> >> diff --git a/drivers/cpufreq/cpufreq_governor.c >> b/drivers/cpufreq/cpufreq_
