On 25-Jun 10:07, Parth Shah wrote: [...]
> Implementation
> ==============
>
> These patches uses UCLAMP mechanism[2] used to clamp utilization from the
> userspace, which can be used to classify the jitter tasks. The task wakeup
> logic uses this information to pack such tasks onto cores which are already
> running busy with CPU intensive tasks. The task packing is done at
> `select_task_rq_fair` only so that in case of wrong decision load balancer
> may pull the classified jitter tasks for maximizing performance.
>
> Any tasks clamped with cpu.util.max=1 (with sched_setattr syscall) are
> classified as jitter tasks.

I don't like this approach: it overloads the meaning of clamps and it also
brings in unwanted side effects, like running jitter tasks at the minimum
OPP.

Do you have any expected minimum frequency for those jitter tasks?
I expect those to be relatively small tasks, but perhaps it still makes
sense to run them at a higher than minimal OPP.

Why not just add a new dedicated per-task scheduling attribute, e.g.
SCHED_FLAG_LATENCY_TOLERANT, and manage it via sched_{set,get}attr()?

I guess such a concept could work well for defining a generic
spread-vs-pack wakeup policy, which is something Android could also
benefit from.

However, what we will still be missing is proper cgroups support.
It is not always possible and/or convenient to explicitly set per-task
attributes. But at the same time, AFAIK using cgroups to define task
properties which do not represent a "resource repartition" is something
very difficult to get accepted in mainline.

In the past, back in 2011, there was an attempt to introduce a timer slack
controller, but apparently it was not very well received:

   Message-ID: <1300111524-5666-1-git-send-email-kir...@shutemov.name>
   https://lore.kernel.org/lkml/20110314164652.5b44fb9e.a...@linux-foundation.org/

But perhaps now the times are more mature and we can try to come up with
compelling cases from both the server and the mobile world.

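To make the per-task flag idea above a bit more concrete, here is a rough
userspace sketch of how a task could be tagged through sched_setattr().
Everything specific in it is an assumption: SCHED_FLAG_LATENCY_TOLERANT does
not exist in mainline, the bit value is an arbitrary placeholder, and the
lt_* names are made up for illustration. An unmodified kernel is expected to
reject the unknown flag (most likely with EINVAL), so this only shows the
shape of the interface, not a working feature.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* HYPOTHETICAL flag: not in mainline, bit value chosen only for illustration. */
#define SCHED_FLAG_LATENCY_TOLERANT 0x80ULL

/*
 * Local copy of the original (version 0) sched_attr layout, named lt_sched_attr
 * so it cannot clash with any definition the system headers may provide.
 */
struct lt_sched_attr {
	uint32_t size;
	uint32_t sched_policy;
	uint64_t sched_flags;
	int32_t  sched_nice;
	uint32_t sched_priority;
	uint64_t sched_runtime;
	uint64_t sched_deadline;
	uint64_t sched_period;
};

static_assert(sizeof(struct lt_sched_attr) == 48,
	      "layout must match the 48-byte version 0 sched_attr");

/* Fill in attributes for a normal (CFS) task, optionally tagged as tolerant. */
void lt_attr_init(struct lt_sched_attr *attr, int tolerant)
{
	memset(attr, 0, sizeof(*attr));
	attr->size = sizeof(*attr);
	attr->sched_policy = 0;	/* SCHED_OTHER */
	if (tolerant)
		attr->sched_flags |= SCHED_FLAG_LATENCY_TOLERANT;
}

/*
 * Ask the kernel to apply the attributes to @pid (0 means the caller).
 * Returns 0 on success or a negative errno value on failure.
 */
int lt_mark_task(pid_t pid, int tolerant)
{
	struct lt_sched_attr attr;

	lt_attr_init(&attr, tolerant);
	if (syscall(SYS_sched_setattr, pid, &attr, 0) == -1)
		return -errno;
	return 0;
}
```

A caller would use lt_mark_task(0, 1) to tag itself. Note that, as written,
this also resets the policy and nice value of the task; a real interface
would probably want to preserve them, which is a separate design question.
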
> We define a core to be non-idle if it is over 12.5% utilized of its
> capacity;

This looks like a random number, can you elaborate on that?

> the jitters are packed over these cores using First-fit
> approach.
>
> To demonstrate/benchmark, one can use a synthetic workload generator
> `turbo_bench.c`[1] available at
> https://github.com/parthsl/tools/blob/master/benchmarks/turbo_bench.c
>
> Following snippet demonstrates the use of TurboSched feature:
> ```
> i=8; ./turbo_bench -t 30 -h $i -n $((i*2)) -j
> ```
>
> Current implementation uses only jitter classified tasks to be packed on
> the first busy cores, but can be further optimized by getting userspace
> input of important tasks and keeping track of such tasks.
> This leads to optimized searching of non idle cores and also more
> accurate as userspace hints are safer than auto classified busy
> cores/tasks.

Hints from user-space look like an interesting concept, could you
elaborate on what you are thinking about in this sense?

-- 
#include <best/regards.h>

Patrick Bellasi