On Thu, 2015-10-08 at 13:19 +0200, Peter Zijlstra wrote:
> On Thu, Oct 08, 2015 at 09:54:21PM +1100, paul.sz...@sydney.edu.au wrote:
> > Good to see that you agree on the fairness issue... it MUST be fixed!
> > CFS might be wrong or wasteful, but never unfair.
>
> I've not yet had time to look at the case at hand, but there are what
> is called 'infeasible weight' scenarios for which it is impossible to
> be fair.
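To put a number on that before moving on: a group's weight entitles it
to a slice of the whole box, but a group can never occupy more CPUs
than it has runnable tasks, so some entitlements simply cannot be
delivered. A minimal userspace sketch of the feasibility check (my
illustration with made-up weights, not anything from the kernel):

#include <stdio.h>

struct group {
	const char *name;
	int weight;	/* cgroup cpu.shares style weight */
	int nr_tasks;	/* runnable tasks in the group */
};

int main(void)
{
	/* Hypothetical: 8 CPUs, two equal weight groups, one with 8
	 * runnable hogs, the other with a single runnable task. */
	struct group g[] = {
		{ "hogs", 1024, 8 },
		{ "solo", 1024, 1 },
	};
	int ncpus = 8, total = 0;
	unsigned int i;

	for (i = 0; i < sizeof(g) / sizeof(g[0]); i++)
		total += g[i].weight;

	for (i = 0; i < sizeof(g) / sizeof(g[0]); i++) {
		/* Weight-fair entitlement, in CPUs. */
		double entitled = (double)g[i].weight / total * ncpus;

		printf("%s: entitled to %.2f CPUs, can use %d -> %s\n",
		       g[i].name, entitled, g[i].nr_tasks,
		       entitled > g[i].nr_tasks ? "INFEASIBLE" : "feasible");
	}
	return 0;
}

The solo group is owed 4 CPUs but can burn at most 1, so no scheduler,
CFS included, can be weight-fair there.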
And sometimes, group wide fairness ain't all that wonderful anyway.

> Also, CFS must remain a practical scheduler, which places bounds on
> the amount of weird cases we can deal with.

Yup, and on a practical note...

master, 1 group of 8 (oink) vs 8 groups of 1 (pert)

  PID USER     PR  NI  VIRT  RES  SHR S  %CPU  %MEM    TIME+ P COMMAND
 5618 root     20   0  8312  840  744 R 90.46 0.005  1:40.48 0 pert
 5630 root     20   0  8312  720  624 R 90.46 0.004  1:38.40 4 pert
 5615 root     20   0  8312  768  672 R 89.48 0.005  1:39.25 6 pert
 5621 root     20   0  8312  792  696 R 89.34 0.005  1:38.49 2 pert
 5627 root     20   0  8312  760  664 R 89.06 0.005  1:36.53 5 pert
 5645 root     20   0  8312  804  708 R 89.06 0.005  1:34.69 1 pert
 5624 root     20   0  8312  716  620 R 88.64 0.004  1:38.45 7 pert
 5612 root     20   0  8312  716  620 R 83.03 0.004  1:40.11 3 pert
 5633 root     20   0  8312  792  696 R 10.94 0.005  0:11.59 4 oink
 5635 root     20   0  8312  804  708 R 10.80 0.005  0:11.74 2 oink
 5637 root     20   0  8312  796  700 R 10.80 0.005  0:11.34 5 oink
 5639 root     20   0  8312  836  740 R 10.80 0.005  0:11.71 2 oink
 5634 root     20   0  8312  840  744 R 10.66 0.005  0:11.36 7 oink
 5636 root     20   0  8312  756  660 R 10.66 0.005  0:11.68 1 oink
 5640 root     20   0  8312  752  656 R 10.10 0.005  0:11.41 7 oink
 5638 root     20   0  8312  804  708 R 9.818 0.005  0:11.99 7 oink

Avg 98.2s per pert group vs 92.8s for the 8 task oink group. Not
_perfect_, but ok. Before reading further, now would be a good time
for readers to chant the "perfect is the enemy of good" mantra,
pretending my not so scientific measurements had actually shown
perfect group wide distribution. You're gonna see good, and it doesn't
resemble perfect... which is good ;-)

master+, 1 group of 8 (oink) vs 8 groups of 1 (pert)

  PID USER     PR  NI  VIRT  RES  SHR S  %CPU  %MEM    TIME+ P COMMAND
19269 root     20   0  8312  716  620 R 77.25 0.004  1:39.43 2 pert
19263 root     20   0  8312  752  656 R 76.65 0.005  1:43.70 7 pert
19257 root     20   0  8312  760  664 R 72.85 0.005  1:37.08 5 pert
19260 root     20   0  8312  804  704 R 71.86 0.005  1:40.42 1 pert
19273 root     20   0  8312  748  652 R 71.26 0.005  1:41.98 6 pert
19266 root     20   0  8312  752  656 R 67.47 0.005  1:41.69 4 pert
19254 root     20   0  8312  744  648 R 61.28 0.005  1:42.88 4 pert
19277 root     20   0  8312  836  740 R 56.29 0.005  0:46.16 5 oink
19281 root     20   0  8312  768  672 R 55.89 0.005  0:42.05 0 oink
19283 root     20   0  8312  840  744 R 44.91 0.005  0:53.05 3 oink
19282 root     20   0  8312  800  704 R 30.74 0.005  0:41.70 3 oink
19284 root     20   0  8312  724  628 R 28.14 0.004  0:42.08 3 oink
19278 root     20   0  8312  752  656 R 25.15 0.005  0:42.26 3 oink
19280 root     20   0  8312  756  660 R 24.35 0.005  0:40.39 3 oink
19279 root     20   0  8312  836  740 R 23.95 0.005  0:45.71 3 oink

Avg 101.6s per pert group vs 353.4s for the 8 task oink group. Not
remotely fair, total group utilization wise. Ah, but now onward to
interactivity...
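First though, a quick sanity check of the master numbers above: nine
equal weight groups sharing eight CPUs get 8/9 of a CPU apiece, so
each solo pert should land near 89%, and each of the eight oinks near
11%. A trivial sketch of that arithmetic (mine, assuming the box above
is the 8 CPU box the P column suggests, with default equal group
weights):

#include <stdio.h>

int main(void)
{
	double ncpus = 8.0;
	double groups = 9.0;	/* 8 x 1 pert + 1 x 8 oink, equal weight */
	double per_group = ncpus / groups;

	/* Each solo pert IS its group; each oink splits its group's
	 * entitlement eight ways. */
	printf("per pert task: %.1f%% of a CPU\n", per_group * 100.0);
	printf("per oink task: %.1f%% of a CPU\n", per_group / 8.0 * 100.0);
	return 0;
}

That predicts 88.9%/11.1%, close enough to the measured ~90%/~11%
split. Now, the desktop.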
master, 8 groups of 1 (pert) vs desktop (mplayer BigBuckBunny-DivXPlusHD.mkv)

  PID USER     PR  NI    VIRT    RES   SHR S  %CPU  %MEM    TIME+ P COMMAND
 4068 root     20   0    8312    724   628 R 99.64 0.004  1:04.32 6 pert
 4065 root     20   0    8312    744   648 R 99.45 0.005  1:04.92 5 pert
 4071 root     20   0    8312    748   652 R 99.27 0.005  1:03.12 7 pert
 4077 root     20   0    8312    840   744 R 98.72 0.005  1:01.46 3 pert
 4074 root     20   0    8312    796   700 R 98.18 0.005  1:03.38 1 pert
 4079 root     20   0    8312    720   624 R 97.99 0.004  1:01.45 4 pert
 4062 root     20   0    8312    836   740 R 96.72 0.005  1:03.44 0 pert
 4059 root     20   0    8312    720   624 R 94.16 0.004  1:04.92 2 pert
 4082 root     20   0 1094400 154324 33592 S 4.197 0.954  0:02.69 0 mplayer
 1029 root     20   0  465332 151540 40816 R 3.285 0.937  0:24.59 2 Xorg
 1773 root     20   0  662592  73308 42012 S 2.007 0.453  0:12.84 5 konsole
  771 root     20   0   11416   1964  1824 S 0.730 0.012  0:10.45 0 rngd
 1722 root     20   0 2866772  65224 51152 S 0.365 0.403  0:03.44 2 kwin
 1769 root     20   0  711684  54212 38020 S 0.182 0.335  0:00.39 1 kmix

That is NOT good. Mplayer and friends need more than that.
Interactivity is _horrible_, and buck is an unwatchable mess (no
biggy, I know every frame).

master+, 8 groups of 1 (pert) vs desktop (mplayer BigBuckBunny-DivXPlusHD.mkv)

  PID USER     PR  NI    VIRT    RES   SHR S  %CPU  %MEM    TIME+ P COMMAND
 4346 root     20   0    8312    756   660 R 99.20 0.005  0:59.89 5 pert
 4349 root     20   0    8312    748   652 R 98.80 0.005  1:00.77 6 pert
 4343 root     20   0    8312    720   624 R 94.81 0.004  1:02.11 2 pert
 4331 root     20   0    8312    724   628 R 91.22 0.004  1:01.16 3 pert
 4340 root     20   0    8312    720   624 R 91.22 0.004  1:01.06 7 pert
 4328 root     20   0    8312    836   740 R 90.42 0.005  1:00.07 4 pert
 4334 root     20   0    8312    756   660 R 87.82 0.005  0:59.84 1 pert
 4337 root     20   0    8312    824   728 R 76.85 0.005  0:52.20 0 pert
 4352 root     20   0 1058812 123876 33388 S 29.34 0.766  0:25.01 3 mplayer
 1029 root     20   0  471168 156748 40316 R 22.36 0.969  0:42.23 3 Xorg
 1773 root     20   0  663080  74176 42012 S 4.192 0.459  0:17.98 1 konsole
  771 root     20   0   11416   1964  1824 R 1.198 0.012  0:13.45 0 rngd
 1722 root     20   0 2866880  65340 51152 R 0.599 0.404  0:04.87 3 kwin
 1788 root      9 -11  516744  11932  8536 S 0.599 0.074  0:01.01 0 pulseaudio
 1733 root     20   0 3369480 141564 71776 S 0.200 0.875  0:05.51 1 plasma-desktop

That's good. Interactivity is fine, I can't even tell pert groups
exist by watching buck kick squirrel butt for the 10387th time.

With master, and one 8 hog group vs desktop/mplayer, I can see the hog
group interfere with mplayer. Add another hog group, and mplayer
lurches quite badly. I can feel even one group while using the mouse
wheel to scroll through mail. With master+, I see/feel none of that
unpleasantness.

Conclusion: task group re-weighting is the mortal enemy of a good
desktop.

sched: disable task group re-weighting on the desktop

Task group wide utilization based weight may work well for servers,
but it is horrible on the desktop. 8 groups of 1 hog demolish
interactivity, 1 group of 8 hogs has a noticeable impact, and two such
groups are very, very noticeable. Turn it off if autogroup is enabled,
and add a feature to let people set the definition of fair to what
serves them best. For the desktop, fixed group weight wins hands down,
no contest.
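Runtime toggling works through the usual sched_features debugfs
interface when CONFIG_SCHED_DEBUG is set: write SMP_FAIR_GROUPS or
NO_SMP_FAIR_GROUPS to /sys/kernel/debug/sched_features to switch
between group wide and fixed group weight on the fly.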
Signed-off-by: Mike Galbraith <umgwanakikb...@gmail.com>
---
 kernel/sched/fair.c     | 10 ++++++----
 kernel/sched/features.h | 14 ++++++++++++++
 2 files changed, 20 insertions(+), 4 deletions(-)

--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -2372,6 +2372,8 @@ static long calc_cfs_shares(struct cfs_r
 {
 	long tg_weight, load, shares;
 
+	if (!sched_feat(SMP_FAIR_GROUPS))
+		return tg->shares;
 	tg_weight = calc_tg_weight(tg, cfs_rq);
 	load = cfs_rq_load_avg(cfs_rq);
 
@@ -2420,10 +2422,10 @@ static void update_cfs_shares(struct cfs
 	se = tg->se[cpu_of(rq_of(cfs_rq))];
 	if (!se || throttled_hierarchy(cfs_rq))
 		return;
-#ifndef CONFIG_SMP
-	if (likely(se->load.weight == tg->shares))
-		return;
-#endif
+	if (!IS_ENABLED(CONFIG_SMP) || !sched_feat(SMP_FAIR_GROUPS)) {
+		if (likely(se->load.weight == tg->shares))
+			return;
+	}
 
 	shares = calc_cfs_shares(cfs_rq, tg);
 	reweight_entity(cfs_rq_of(se), se, shares);
--- a/kernel/sched/features.h
+++ b/kernel/sched/features.h
@@ -88,3 +88,17 @@ SCHED_FEAT(LB_MIN, false)
  */
 SCHED_FEAT(NUMA, true)
 #endif
+
+#if defined(CONFIG_SMP) && defined(CONFIG_FAIR_GROUP_SCHED)
+/*
+ * With SMP_FAIR_GROUPS set, group wide activity determines the share
+ * for all group members.  This does very bad things to interactivity
+ * when a desktop box is heavily loaded.  Default to off when autogroup
+ * is enabled, and let all users set it to what works best for them.
+ */
+#ifndef CONFIG_SCHED_AUTOGROUP
+SCHED_FEAT(SMP_FAIR_GROUPS, true)
+#else
+SCHED_FEAT(SMP_FAIR_GROUPS, false)
+#endif
+#endif