Multithreading may speed long-running kernel tasks, but overly optimistic parallelization can go wrong if too many helper threads are started on an already-busy system. Such helpers can degrade the performance of other tasks, so they should be sensitive to current CPU utilization[1].
To achieve this, run helpers at MAX_NICE so that their CPU time is proportional to idle CPU time. The main thread that called into ktask naturally runs at its original priority so that it can make progress on a heavily loaded system, as it would if ktask were not in the picture. I tested two different cases in which a non-ktask and a ktask workload compete for the same CPUs with the goal of showing that normal priority (i.e. nice=0) ktask helpers cause the non-ktask workload to run more slowly, whereas MAX_NICE ktask helpers don't. Testing notes: - Each case was run using 8 CPUs on a large two-socket server, with a cpumask allowing all test threads to run anywhere within the 8. - The non-ktask workload used 7 threads and the ktask workload used 8 threads to evaluate how much ktask helpers, rather than the main ktask thread, disturbed the non-ktask workload. - The non-ktask workload was started after the ktask workload and run for less time to maximize the chances that the non-ktask workload would be disturbed. - Runtimes in seconds. Case 1: Synthetic, worst-case CPU contention ktask_test - a tight loop doing integer multiplication to max out on CPU; used for testing only, does not appear in this series stress-ng - cpu stressor ("-c --cpu-method ackerman --cpu-ops 1200"); stress-ng alone (stdev) max_nice (stdev) normal_prio (stdev) ------------------------------------------------------------ ktask_test 96.87 ( 1.09) 90.81 ( 0.29) stress-ng 43.04 ( 0.00) 43.58 ( 0.01) 75.86 ( 0.39) This case shows MAX_NICE helpers make a significant difference compared to normal priority helpers, with stress-ng taking 76% longer to finish when competing with normal priority ktask threads than when run by itself, but only 1% longer when run with MAX_NICE helpers. The 1% comes from the small amount of CPU time MAX_NICE threads are given despite their low priority. Case 2: Real-world CPU contention ktask_vfio - VFIO page pin a 175G kvm guest usemem - faults in 25G of anonymous THP per thread, PAGE_SIZE stride; used to mimic the page clearing that dominates in ktask_vfio so that usemem competes for the same system resources usemem alone (stdev) max_nice (stdev) normal_prio (stdev) ------------------------------------------------------------ ktask_vfio 14.74 ( 0.04) 9.93 ( 0.09) usemem 10.45 ( 0.04) 10.75 ( 0.04) 14.14 ( 0.07) In the more realistic case 2, the effect is similar although not as pronounced. The usemem threads take 35% longer to finish with normal priority ktask threads than when run alone, but only 3% longer when MAX_NICE is used. All ktask users outside of VFIO boil down to page clearing, so I imagine the results would be similar for them. [1] lkml.kernel.org/r/20171206143509.gg7...@dhcp22.suse.cz Signed-off-by: Daniel Jordan <daniel.m.jor...@oracle.com> --- kernel/ktask.c | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/kernel/ktask.c b/kernel/ktask.c index b91c62f14dcd..72293a0f50c3 100644 --- a/kernel/ktask.c +++ b/kernel/ktask.c @@ -575,6 +575,18 @@ void __init ktask_init(void) goto alloc_fail; } + /* + * All ktask worker threads have the lowest priority on the system so + * they don't disturb other workloads. + */ + attrs->nice = MAX_NICE; + + ret = apply_workqueue_attrs(ktask_wq, attrs); + if (ret != 0) { + pr_warn("disabled (couldn't apply attrs to ktask_wq)"); + goto apply_fail; + } + attrs->no_numa = true; ret = apply_workqueue_attrs(ktask_nonuma_wq, attrs); -- 2.19.1