I have a programe to test CFS latency or fairness on high load, the test shows CFS has high latency when disable autogroup
the program is in https://gist.github.com/anonymous/af21ae289cfa5c6310eeac73b7a478ff. the programe creates many threads, every thread does thing like: do { now =gettime() loop to spend 10 microseconds delta = gettime()-now if (delta > max_latency) { max_latency=delta printf( max_latency); } } while(1) I run this program on a machine with 48 processor, this programe created 2000 thread, then every processor has about 40 pthread, every thread should finish its 10ms computation in 400ms= 40pthread*10ms but test result show some thread takes very long time # ./pipe_test -n 2000 eat cpu with 2000 threads, delay time 1000 ms 648 delay 1037 ms 48923 delay 1038 ms 1810 delay 1349 ms 49142 delay 1482 ms 1728 delay 1574 ms 1518 delay 1713 ms 808 delay 1714 ms 1702 delay 1733 ms 49004 delay 1783 ms 48821 delay 1785 ms 451 delay 1865 ms 990 delay 1910 ms 1626 delay 1957 ms 537 delay 2420 ms 2021 delay 3242 ms 763 delay 3488 ms 622 delay 3614 ms 1887 delay 4474 ms 1267 delay 4924 ms 721 delay 5406 ms 1810 delay 5873 ms 1470 delay 5969 ms 1702 delay 7834 ms 48821 delay 8063 ms ^C The kernel version is 4.9.23 and I disable autogroup; if autogroup is enabled, no this issue ftrace result: <...>-48821 [003] d... 64151.635476: sched_switch: prev_comm=a.out prev_pid=48821 prev_prio=120 prev_state=R ==> next_comm=baas_agent next_pid=17118 next_prio=120 <...>-48821 [003] d... 64151.636531: sched_switch: prev_comm=a.out prev_pid=48821 prev_prio=120 prev_state=R ==> next_comm=baas_agent next_pid=17118 next_prio=120 <...>-48821 [003] d... 64151.639570: sched_switch: prev_comm=a.out prev_pid=48821 prev_prio=120 prev_state=R ==> next_comm=baas_agent next_pid=17275 next_prio=120 <...>-48821 [003] d... 64159.703051: sched_switch: prev_comm=a.out prev_pid=48821 prev_prio=120 prev_state=R+ ==> next_comm=kworker/u97:0 next_pid=36929 next_prio=120 <...>-48821 [003] d... 64159.703091: sched_switch: prev_comm=a.out prev_pid=48821 prev_prio=120 prev_state=R ==> next_comm=kworker/u97:0 next_pid=36929 next_prio=120 <...>-48821 [003] d... 64159.703978: sched_switch: prev_comm=a.out prev_pid=48821 prev_prio=120 prev_state=R ==> next_comm=baas_agent next_pid=17275 next_prio=120 <...>-48821 [003] d... 64159.705542: sched_switch: prev_comm=a.out prev_pid=48821 prev_prio=120 prev_state=R ==> next_comm=baas_agent next_pid=16879 next_prio=120 # grep sched_migrate_task trace|grep 48821 <...>-688 [019] d... 64161.828654: sched_migrate_task: comm=a.out pid=48821 prio=120 orig_cpu=3 dest_cpu=19 <...>-48821 [019] d... 64161.828862: sched_migrate_task: comm=a.out pid=49053 prio=120 orig_cpu=43 dest_cpu=19 #