On Thu, Apr 25, 2019 at 5:55 PM Ingo Molnar <mi...@kernel.org> wrote:
>
>
> * Aubrey Li <aubrey.in...@gmail.com> wrote:
>
> > On Wed, Apr 24, 2019 at 10:00 PM Julien Desfossez
> > <jdesfos...@digitalocean.com> wrote:
> > >
> > > On 24-Apr-2019 09:13:10 PM, Aubrey Li wrote:
> > > > On Wed, Apr 24, 2019 at 12:18 AM Vineeth Remanan Pillai
> > > > <vpil...@digitalocean.com> wrote:
> > > > >
> > > > > Second iteration of the core-scheduling feature.
> > > > >
> > > > > This version fixes apparent bugs and performance issues in v1. This
> > > > > doesn't fully address the issue of core sharing between processes
> > > > > with different tags. Core sharing still happens 1% to 5% of the time
> > > > > based on the nature of workload and timing of the runnable processes.
> > > > >
> > > > > Changes in v2
> > > > > -------------
> > > > > - rebased on mainline commit: 6d906f99817951e2257d577656899da02bb33105
> > > >
> > > > Thanks to post v2, based on this version, here is my benchmarks result.
> > > >
> > > > Environment setup
> > > > --------------------------
> > > > Skylake server, 2 numa nodes, 104 CPUs (HT on)
> > > > cgroup1 workload, sysbench (CPU intensive non AVX workload)
> > > > cgroup2 workload, gemmbench (AVX512 workload)
> > > >
> > > > Case 1: task number < CPU num
> > > > --------------------------------------------
> > > > 36 sysbench threads in cgroup1
> > > > 36 gemmbench threads in cgroup2
> > > >
> > > > core sched off:
> > > > - sysbench 95th percentile latency(ms): avg = 4.952, stddev = 0.55342
> > > > core sched on:
> > > > - sysbench 95th percentile latency(ms): avg = 3.549, stddev = 0.04449
> > > >
> > > > Due to core cookie matching, sysbench tasks won't be affect by AVX512
> > > > tasks, latency has ~28% improvement!!!
> > > >
> > > > Case 2: task number > CPU number
> > > > -------------------------------------------------
> > > > 72 sysbench threads in cgroup1
> > > > 72 gemmbench threads in cgroup2
> > > >
> > > > core sched off:
> > > > - sysbench 95th percentile latency(ms): avg = 11.914, stddev = 3.259
> > > > core sched on:
> > > > - sysbench 95th percentile latency(ms): avg = 13.289, stddev = 4.863
> > > >
> > > > So not only power, now security and performance is a pair of
> > > > contradictions.
> > > > Due to core cookie not matching and forced idle introduced, latency has
> > > > ~12% regression.
> > > >
> > > > Any comments?
> > >
> > > Would it be possible to post the results with HT off as well ?
> >
> > What's the point here to turn HT off? The latency is sensitive to the
> > relationship between the task number and CPU number. Usually less CPU
> > number, more run queue wait time, and worse result.
>
> HT-off numbers are mandatory: turning HT off is by far the simplest way
> to solve the security bugs in these CPUs.
>
> Any core-scheduling solution *must* perform better than HT-off for all
> relevant workloads, otherwise what's the point?
>

Got it, I'll measure HT-off cases soon.

Thanks,
-Aubrey
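
For reference, the ~28% and ~12% figures quoted above can be rechecked from
the posted averages. This is only a sketch: it assumes the percentages are
relative changes of the core-sched-on average against the core-sched-off
average, and the helper name relative_change is made up for illustration.

```python
def relative_change(baseline, measured):
    """Relative change of measured vs. baseline, in percent.

    Assumption: the percentages in the thread are computed against the
    core-sched-off average. Negative means lower (better) latency.
    """
    return (measured - baseline) / baseline * 100.0


# sysbench 95th percentile latency averages (ms) as posted in the thread:
# (core sched off, core sched on)
results = {
    "case 1 (36 + 36 threads)": (4.952, 3.549),
    "case 2 (72 + 72 threads)": (11.914, 13.289),
}

for name, (sched_off, sched_on) in results.items():
    print(f"{name}: {relative_change(sched_off, sched_on):+.1f}%")
```

This gives about -28.3% for case 1 and +11.5% for case 2, which matches the
rounded numbers in the quoted mail. It says nothing about HT-off, since those
measurements have not been posted yet.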