[patch 41/45] sched: fix high wake up latencies with FAIR_USER_SCHED
2.6.24-stable review patch. If anyone has any objections, please let us know.

--
From: Srivatsa Vaddagiri <[EMAIL PROTECTED]>

patch 296825cbe14d4c95ee9c41ca5824f7487bfb4d9d in mainline.

The reason why we are getting better wakeup latencies for
!FAIR_USER_SCHED is because of this snippet of code in place_entity():

	if (!initial) {
		/* sleeps upto a single latency don't count. */
		if (sched_feat(NEW_FAIR_SLEEPERS) && entity_is_task(se))
						     ^^^^^^^^^^^^^^^^^^
			vruntime -= sysctl_sched_latency;

		/* ensure we never gain time by being placed backwards. */
		vruntime = max_vruntime(se->vruntime, vruntime);
	}

NEW_FAIR_SLEEPERS feature gives credit for sleeping only to tasks and
not group-level entities. With the patch attached, I could see that
wakeup latencies with FAIR_USER_SCHED are restored to the same level as
!FAIR_USER_SCHED.

Signed-off-by: Ingo Molnar <[EMAIL PROTECTED]>
Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]>

---
 kernel/sched_fair.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/kernel/sched_fair.c
+++ b/kernel/sched_fair.c
@@ -511,7 +511,7 @@ place_entity(struct cfs_rq *cfs_rq, stru

 	if (!initial) {
 		/* sleeps upto a single latency don't count. */
-		if (sched_feat(NEW_FAIR_SLEEPERS) && entity_is_task(se))
+		if (sched_feat(NEW_FAIR_SLEEPERS))
 			vruntime -= sysctl_sched_latency;

 		/* ensure we never gain time by being placed backwards. */
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
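As an illustration, the effect of dropping the entity_is_task() check can be sketched with a simplified Python model of the sleeper-credit placement (this is not the kernel code; the function, constants, and flag names here are hypothetical simplifications of place_entity()):

```python
# Simplified model of the CFS sleeper credit in place_entity().
# Illustrative only: real vruntimes are weighted u64 nanoseconds.

SYSCTL_SCHED_LATENCY = 20_000_000  # assumed value, in ns

def place_entity(min_vruntime, se_vruntime, is_task, initial=False,
                 new_fair_sleepers=True, credit_tasks_only=True):
    """Return the vruntime a woken entity is requeued with."""
    vruntime = min_vruntime
    if not initial:
        # Before the patch (credit_tasks_only=True): only task entities
        # got the sleeper credit. The patch drops the entity_is_task()
        # check, so group entities get it too.
        if new_fair_sleepers and (is_task or not credit_tasks_only):
            vruntime -= SYSCTL_SCHED_LATENCY
        # Ensure we never gain time by being placed backwards.
        vruntime = max(se_vruntime, vruntime)
    return vruntime

# A woken task is placed one latency behind min_vruntime, so it preempts soon:
task_v = place_entity(100_000_000, 0, is_task=True)            # 80_000_000
# Before the patch, its group entity got no credit and queued at min_vruntime,
# so with FAIR_USER_SCHED the wakeup still waited behind the running group:
group_before = place_entity(100_000_000, 0, is_task=False)     # 100_000_000
# After the patch, the group entity gets the same credit as the task:
group_after = place_entity(100_000_000, 0, is_task=False,
                           credit_tasks_only=False)            # 80_000_000
```

This matches the observation in the thread: the asymmetry was not in the tasks themselves but in how group-level entities were placed on wakeup.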
Re: High wake up latencies with FAIR_USER_SCHED
On Thu, 2008-01-31 at 13:49 +0100, Guillaume Chazarain wrote:
> On 1/31/08, Peter Zijlstra <[EMAIL PROTECTED]> wrote:
> > Does something like this help?
>
> I made it compile by open coding undefined macros instead of
> refactoring the whole file.
> But it didn't affect wake up latencies.

Ah, well, what can one expect from mid-night ideas :-)

Thanks for testing!
Re: High wake up latencies with FAIR_USER_SCHED
On 1/31/08, Peter Zijlstra <[EMAIL PROTECTED]> wrote:
> Does something like this help?

I made it compile by open coding undefined macros instead of
refactoring the whole file.
But it didn't affect wake up latencies.

Thanks.

--
Guillaume
Re: High wake up latencies with FAIR_USER_SCHED
On Mon, 2008-01-28 at 21:13 +0100, Guillaume Chazarain wrote:
> Unfortunately it seems to not be completely fixed, with this script:
>
> [...]
>
> I get seemingly unpredictable latencies (with or without the patch applied):
>
> # ./sched.py
> 14.810944 ms
> [...]
> 4.989052 ms
>
> Without FAIR_USER_SCHED, latencies are consistently in the noise.
> Also, I forgot to mention that I'm on a single CPU.
>
> Thanks for the help.

Does something like this help?

Index: linux-2.6/kernel/sched_fair.c
===================================================================
--- linux-2.6.orig/kernel/sched_fair.c
+++ linux-2.6/kernel/sched_fair.c
@@ -267,8 +267,12 @@ static u64 sched_slice(struct cfs_rq *cf
 {
 	u64 slice = __sched_period(cfs_rq->nr_running);

-	slice *= se->load.weight;
-	do_div(slice, cfs_rq->load.weight);
+	for_each_sched_entity(se) {
+		cfs_rq = cfs_rq_of(se);
+
+		slice *= se->load.weight;
+		do_div(slice, cfs_rq->load.weight);
+	}

 	return slice;
 }
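For illustration, here is a rough Python model of what that (ultimately unsuccessful) patch changes in sched_slice(): the time slice is scaled by weight/load at every hierarchy level on the way to the root, instead of only at the entity's own level. The helper names and constants are hypothetical simplifications, not the kernel implementation:

```python
# Rough model of the sched_slice() change in the patch above.
# Illustrative only; the kernel uses u64 math and load structs.

def sched_period(nr_running, sched_latency_ms=20, min_granularity_ms=4):
    # Like __sched_period(): stretch the period when many tasks run.
    return max(sched_latency_ms, nr_running * min_granularity_ms)

def sched_slice_flat(period_ms, weight, rq_load):
    # Old code: scale once, at the entity's own level.
    return period_ms * weight // rq_load

def sched_slice_hier(period_ms, levels):
    # Patched code: scale at every level from the entity to the root.
    # `levels` is a list of (entity_weight, runqueue_load), leaf first.
    slice_ms = period_ms
    for weight, rq_load in levels:
        slice_ms = slice_ms * weight // rq_load
    return slice_ms

# A task holding half its group's load, in a group holding half the
# root runqueue's load:
period = sched_period(nr_running=4)            # 20 ms
flat = sched_slice_flat(period, 1024, 2048)    # 10 ms, ignores the group level
hier = sched_slice_hier(period, [(1024, 2048), (1024, 2048)])  # 5 ms
```

The hierarchical walk shrinks the slice by the group's share as well, which was the mid-night idea for taming per-group latencies; as the follow-up messages note, it did not help in testing.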
Re: High wake up latencies with FAIR_USER_SCHED
On Tue, Jan 29, 2008 at 04:53:56PM +0100, Guillaume Chazarain wrote:
> I just thought about something to restore low latencies with
> FAIR_GROUP_SCHED, but it's possibly utter nonsense, so bear with me
> ;-) The idea would be to reverse the trees upside down. The scheduler
> would only see tasks (on the leaves) so could apply its interactivity
> magic, but the hierarchical groups would be used to compute dynamic
> loads for each task according to their position in the tree:

I think this is equivalent to flattening the hierarchy? We discussed
this a bit sometime back [1], but one of its weaknesses is providing
strong partitioning between groups when it comes to ensuring fairness.
Ex: imagine a group which does a fork-bomb. With the flattened tree, it
affects other groups more than it would with a 1-level deep hierarchy.

Having said that, I would be interested to hear other solutions that
maintain this good partitioning b/n groups and still provide good
interactivity!

1. http://lkml.org/lkml/2007/5/30/300

> - now:
>   - we schedule each level of the tree starting from the root
>
> - with my proposition:
>   - we schedule tasks like with !FAIR_GROUP_SCHED, but
>     calc_delta_fair() would traverse the tree starting from the leaves
>     to compute the dynamic load.

--
Regards,
vatsa
Re: High wake up latencies with FAIR_USER_SCHED
On Jan 29, 2008 6:47 AM, Srivatsa Vaddagiri <[EMAIL PROTECTED]> wrote:
> IMHO this is expected results and if someone really needs to cut down
> this latency, they can reduce sysctl_sched_latency (which will be bad
> from perf standpoint, as we will cause more cache thrashing with that).

Thank you very much for the detailed explanation Srivatsa, that made a
lot of sense. Unfortunately, it means I'll disable FAIR_USER_SCHED as I
initially thought these latencies were caused by my local patches that
give each group a load proportional to the max load of its elements.
Anyway, I don't absolutely need a fair user scheduler on my laptop, but
low latencies in the default configuration are nice to have.

I just thought about something to restore low latencies with
FAIR_GROUP_SCHED, but it's possibly utter nonsense, so bear with me ;-)

The idea would be to reverse the trees upside down. The scheduler would
only see tasks (on the leaves) so could apply its interactivity magic,
but the hierarchical groups would be used to compute dynamic loads for
each task according to their position in the tree:

- now:
  - we schedule each level of the tree starting from the root

- with my proposition:
  - we schedule tasks like with !FAIR_GROUP_SCHED, but
    calc_delta_fair() would traverse the tree starting from the leaves
    to compute the dynamic load.

Thanks.

--
Guillaume
Re: High wake up latencies with FAIR_USER_SCHED
On Mon, Jan 28, 2008 at 09:13:53PM +0100, Guillaume Chazarain wrote:
> Unfortunately it seems to not be completely fixed, with this script:

The maximum scheduling latency of a task with the group scheduler is:

	Lmax = latency to schedule group entity at level 0
	     + latency to schedule group entity at level 1
	     + ...
	     + latency to schedule task entity at the last level

The more hierarchical levels there are, the higher the maximum latency.
This is particularly so because vruntime (and not wall-clock time) is
used as the basis of preemption of entities. The latency at each level
also depends on the number of entities at that level and on the
sysctl_sched_latency/sched_nr_latency settings.

In this case, we only have two levels - userid + task. So the max
scheduling latency is:

	Lmax = latency to schedule uid1 group entity (L0)
	     + latency to schedule the sleeper task within uid1 group (L1)

In the first script that you had, uid1 had only one sleeper task, while
uid2 had two cpu-hogs. This means L1 is always zero for the sleeper
task. L0 is also substantially reduced with the patch I sent (giving
sleep credit to group-level entities). Thus we were able to get low
scheduling latencies in the case of the first script.

The second script you have sent generates two tasks (sleeper + hog)
under uid 1 and one cpu-hog task under uid 2. Consequently the group
entity corresponding to uid 1 is always active and hence there is no
question of giving it credit for sleeping. As a result, we should
expect worst-case latencies of up to [2 * 10 = 20 ms] in this case. The
results you have fall within this range.

In case of !FAIR_USER_SCHED, the sleeper task always gets sleep credits
and hence its latency is drastically reduced.

IMHO these are expected results, and if someone really needs to cut
down this latency, they can reduce sysctl_sched_latency (which will be
bad from a perf standpoint, as we will cause more cache thrashing with
that).

> [...]
>
> Without FAIR_USER_SCHED, latencies are consistently in the noise.
> Also, I forgot to mention that I'm on a single CPU.
>
> Thanks for the help.
>
> --
> Guillaume

--
Regards,
vatsa
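The 2 * 10 = 20 ms bound above can be sanity-checked against the reported numbers with a few lines of Python (a sketch assuming a ~10 ms per-level contribution; the helper name is hypothetical):

```python
# Back-of-envelope check of the worst-case wakeup latency bound:
# Lmax is the sum of the latency contributed at each hierarchy level.

SCHED_LATENCY_MS = 10  # assumed per-level contribution (default-ish)

def worst_case_latency(levels, per_level_ms=SCHED_LATENCY_MS):
    return levels * per_level_ms

# A few of the median latencies reported by the second test script:
samples_ms = [14.810944, 19.829893, 1.968050, 8.021021, 11.958027,
              5.995893, 6.547832, 4.989052]

bound = worst_case_latency(levels=2)  # userid level + task level = 20 ms
within_bound = all(s <= bound for s in samples_ms)
```

With two levels the bound is 20 ms, and all of the quoted samples indeed fall below it, consistent with the "expected results" conclusion.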
Re: High wake up latencies with FAIR_USER_SCHED
Unfortunately it seems to not be completely fixed, with this script:

#!/usr/bin/python

import os
import time

SLEEP_TIME = 0.1
SAMPLES = 5
PRINT_DELAY = 0.5

def print_wakeup_latency():
    times = []
    last_print = 0
    while True:
        start = time.time()
        time.sleep(SLEEP_TIME)
        end = time.time()
        times.insert(0, end - start - SLEEP_TIME)
        del times[SAMPLES:]
        if end > last_print + PRINT_DELAY:
            copy = times[:]
            copy.sort()
            print '%f ms' % (copy[len(copy)/2] * 1000)
            last_print = end

if os.fork() == 0:
    if os.fork() == 0:
        os.setuid(1)
        while True:
            pass
    else:
        os.setuid(2)
        while True:
            pass
else:
    os.setuid(1)
    print_wakeup_latency()

I get seemingly unpredictable latencies (with or without the patch applied):

# ./sched.py
14.810944 ms
19.829893 ms
1.968050 ms
8.021021 ms
-0.017977 ms
4.926109 ms
11.958027 ms
5.995893 ms
1.992130 ms
0.007057 ms
0.217819 ms
-0.004864 ms
5.907202 ms
6.547832 ms
-0.012970 ms
0.209951 ms
-0.002003 ms
4.989052 ms

Without FAIR_USER_SCHED, latencies are consistently in the noise.
Also, I forgot to mention that I'm on a single CPU.

Thanks for the help.

--
Guillaume
Re: High wake up latencies with FAIR_USER_SCHED
Hi Srivatsa,

On Jan 28, 2008 3:31 AM, Srivatsa Vaddagiri <[EMAIL PROTECTED]> wrote:
> Given that sysctl_sched_wakeup_granularity is set to 10ms by default,
> this doesn't sound abnormal.

Indeed, by lowering sched_wakeup_granularity I get much better
latencies, but lowering sched_latency seems to be more effective.

> NEW_FAIR_SLEEPERS feature gives credit for sleeping only to tasks and
> not group-level entities. With the patch attached, I could see that
> wakeup latencies with FAIR_USER_SCHED are restored to the same level
> as !FAIR_USER_SCHED.

Thanks for the patch, it works perfectly.

> However I am not sure whether that is the way to go. We want to let
> one group of tasks running as much as possible until the
> fairness/wakeup-latency threshold is exceeded. If someone does want
> better wakeup latencies between groups too, they can always tune
> sysctl_sched_wakeup_granularity.

Having an inconsistency here between FAIR_USER_SCHED and
!FAIR_USER_SCHED sounds strange, but Ingo took the patch, so I'm happy
:-)

Thanks for the replies.

--
Guillaume
Re: High wake up latencies with FAIR_USER_SCHED
* Srivatsa Vaddagiri <[EMAIL PROTECTED]> wrote:

> NEW_FAIR_SLEEPERS feature gives credit for sleeping only to tasks and
> not group-level entities. With the patch attached, I could see that
> wakeup latencies with FAIR_USER_SCHED are restored to the same level
> as !FAIR_USER_SCHED.
>
> However I am not sure whether that is the way to go. We want to let
> one group of tasks running as much as possible until the
> fairness/wakeup-latency threshold is exceeded. If someone does want
> better wakeup latencies between groups too, they can always tune
> sysctl_sched_wakeup_granularity.

The patch does look like the right thing to do. There's nothing special
about 'groups' versus 'tasks' in terms of scheduling. And most
importantly, this solves the behavioral asymmetry observed by Guillaume
as well - which makes it an obvious-to-add regression fix.

I've added your patch to the scheduler queue.

	Ingo
Re: High wake up latencies with FAIR_USER_SCHED
On Sun, Jan 27, 2008 at 09:01:15PM +0100, Guillaume Chazarain wrote:
> I noticed some strangely high wake up latencies with FAIR_USER_SCHED
> using this script:
> We have two busy loops with UID=1.
> And UID=2 maintains the running median of its wake up latency.
> I get these latencies:
>
> # ./sched.py
> 4.300022 ms
> 4.801178 ms
> 4.604006 ms

Given that sysctl_sched_wakeup_granularity is set to 10ms by default,
this doesn't sound abnormal.

> Disabling FAIR_USER_SCHED restores wake up latencies in the noise:
>
> # ./sched.py
> -0.156975 ms
> -0.067091 ms
> -0.022984 ms

The reason why we get better wakeup latencies with !FAIR_USER_SCHED is
this snippet of code in place_entity():

        if (!initial) {
                /* sleeps upto a single latency don't count. */
                if (sched_feat(NEW_FAIR_SLEEPERS) && entity_is_task(se))
                                                     ^^^^^^^^^^^^^^^^^^
                        vruntime -= sysctl_sched_latency;

                /* ensure we never gain time by being placed backwards. */
                vruntime = max_vruntime(se->vruntime, vruntime);
        }

The NEW_FAIR_SLEEPERS feature gives credit for sleeping only to tasks,
not to group-level entities. With the patch attached, I could see that
wakeup latencies with FAIR_USER_SCHED are restored to the same level as
!FAIR_USER_SCHED.

However, I am not sure whether that is the way to go. We want to let one
group of tasks run as much as possible until the fairness/wakeup-latency
threshold is exceeded. If someone does want better wakeup latencies
between groups too, they can always tune sysctl_sched_wakeup_granularity.

> Strangely enough, another way to restore normal latencies is to change
> setuid(2) to setuid(1), that is, putting the latency measurement in
> the same group as the two busy loops.

--
Regards,
vatsa

---
 kernel/sched_fair.c |    2 +-
 1 files changed, 1 insertion(+), 1 deletion(-)

Index: current/kernel/sched_fair.c
===================================================================
--- current.orig/kernel/sched_fair.c
+++ current/kernel/sched_fair.c
@@ -511,7 +511,7 @@
 	if (!initial) {
 		/* sleeps upto a single latency don't count. */
-		if (sched_feat(NEW_FAIR_SLEEPERS) && entity_is_task(se))
+		if (sched_feat(NEW_FAIR_SLEEPERS))
 			vruntime -= sysctl_sched_latency;

 		/* ensure we never gain time by being placed backwards. */
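[Editor's note: the following toy model is not from the original mail and is
not kernel code. It is a minimal Python sketch of the placement logic quoted
above, using an illustrative sysctl_sched_latency of 20ms, to show why
restricting the sleeper credit to tasks penalizes group entities: an
uncredited waking entity is placed at min_vruntime instead of behind it, so
it does not preempt the running group promptly.]

```python
# Toy model of place_entity()'s NEW_FAIR_SLEEPERS credit (illustrative only).
SYSCTL_SCHED_LATENCY = 20.0  # ms; hypothetical value for the example

def place_entity(cfs_min_vruntime, se_vruntime, is_task,
                 new_fair_sleepers=True, credit_tasks_only=True):
    """Return the vruntime a waking entity is placed at on the CFS timeline."""
    vruntime = cfs_min_vruntime
    # sleeps up to a single latency don't count: subtract the sleeper credit
    if new_fair_sleepers and (is_task or not credit_tasks_only):
        vruntime -= SYSCTL_SCHED_LATENCY
    # ensure we never gain time by being placed backwards
    return max(se_vruntime, vruntime)

# Before the patch: a group entity gets no credit, so it is placed at
# min_vruntime and its woken task may wait up to a full wakeup granularity.
print(place_entity(100.0, 50.0, is_task=False))   # 100.0 (no credit)

# After the patch, all entities get the credit and are placed earlier,
# so the waking group preempts sooner.
print(place_entity(100.0, 50.0, is_task=False, credit_tasks_only=False))  # 80.0
```

The max_vruntime() clamp still prevents an entity that barely slept from
gaining time: if its own vruntime already exceeds the credited value, it
keeps its own vruntime.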
High wake up latencies with FAIR_USER_SCHED
Hi,

I noticed some strangely high wake up latencies with FAIR_USER_SCHED
using this script:

#!/usr/bin/python
import os
import time

SLEEP_TIME = 0.1
SAMPLES = 100
PRINT_DELAY = 0.5

def print_wakeup_latency():
    times = []
    last_print = 0
    while True:
        start = time.time()
        time.sleep(SLEEP_TIME)
        end = time.time()
        times.insert(0, end - start - SLEEP_TIME)
        del times[SAMPLES:]
        if end > last_print + PRINT_DELAY:
            copy = times[:]
            copy.sort()
            print '%f ms' % (copy[len(copy)/2] * 1000)
            last_print = end

if os.fork() == 0:
    os.setuid(1)
    for i in xrange(2):
        if os.fork() == 0:
            while True:
                pass
else:
    os.setuid(2) # <-- here
    print_wakeup_latency()

We have two busy loops with UID=1.
And UID=2 maintains the running median of its wake up latency.
I get these latencies:

# ./sched.py
4.300022 ms
4.801178 ms
4.604006 ms
4.606867 ms
4.604006 ms
4.606867 ms
4.604006 ms
4.606867 ms
4.606867 ms
4.676008 ms
4.604006 ms
4.604006 ms
4.606867 ms

Disabling FAIR_USER_SCHED restores wake up latencies in the noise:

# ./sched.py
-0.156975 ms
-0.067091 ms
-0.022984 ms
-0.022984 ms
-0.022030 ms
-0.022030 ms
-0.022030 ms
-0.021076 ms
-0.015831 ms
-0.015831 ms
-0.016069 ms
-0.015831 ms

Strangely enough, another way to restore normal latencies is to change
setuid(2) to setuid(1), that is, putting the latency measurement in
the same group as the two busy loops.

Thanks in advance for any help.

--
Guillaume
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
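[Editor's note: a standalone Python sketch, not part of the original mail,
isolating the measurement idea in the script above: sleep for a fixed
interval, record how much longer than requested the sleep took, and report
the median of recent samples. Note that `copy[len(copy)/2]` picks the upper
middle element of the sorted list (no averaging for even counts), and the
overshoot can come out slightly negative when the sleep completes marginally
early, which presumably explains the negative readings above.]

```python
import time

SLEEP_TIME = 0.01  # seconds; shortened from the script's 0.1 for a quick demo


def wakeup_overshoot_ms(sleep_time=SLEEP_TIME):
    """How much longer (in ms) than requested did time.sleep() actually take?"""
    start = time.time()
    time.sleep(sleep_time)
    return (time.time() - start - sleep_time) * 1000.0


def running_median(samples):
    """Median as the script computes it: upper middle of the sorted samples."""
    ordered = sorted(samples)
    return ordered[len(ordered) // 2]


# Collect a handful of overshoots and report their median, like sched.py does.
overshoots = [wakeup_overshoot_ms() for _ in range(9)]
print('%f ms' % running_median(overshoots))
```

Under FAIR_USER_SCHED the waking measurement process sits in a different
group than the busy loops, so its reported median tracks the inter-group
preemption delay rather than pure timer noise.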