Hi Thara,

On Wednesday 10 Oct 2018 at 08:17:51 (+0200), Ingo Molnar wrote:
>
> * Thara Gopinath <thara.gopin...@linaro.org> wrote:
>
> > Thermal governors can respond to an overheat event for a cpu by
> > capping the cpu's maximum possible frequency. This in turn
> > means that the maximum available compute capacity of the
> > cpu is restricted. But today in the Linux kernel, in the event of
> > maximum frequency capping of a cpu, the maximum available compute
> > capacity of the cpu is not adjusted at all. In other words, the
> > scheduler is unaware of maximum cpu capacity restrictions placed due
> > to thermal activity. This patch series attempts to address this issue.
> > The benefits identified are better task placement among available
> > cpus in the event of overheating, which in turn leads to better
> > performance numbers.
> >
> > The delta between the maximum possible capacity of a cpu and the
> > maximum available capacity of a cpu due to a thermal event can
> > be considered as thermal pressure. Instantaneous thermal pressure
> > is hard to record and can sometimes be erroneous, as there can be a
> > mismatch between the actual capping of capacity and the scheduler
> > recording it. Thus the solution is to have a weighted average per-cpu
> > value for thermal pressure over time. The weight reflects the amount
> > of time the cpu has spent at a capped maximum frequency. To
> > accumulate, average and appropriately decay thermal pressure, this
> > patch series uses pelt signals and reuses the available framework
> > that does a similar bookkeeping of rt/dl task utilization.
> >
> > Regarding testing, basic build, boot and sanity testing have been
> > performed on a hikey960 mainline kernel with a Debian file system.
> > Further, aobench (an occlusion renderer for benchmarking real-world
> > floating point performance) showed the following results on hikey960
> > with Debian.
> >
> >                                           Result       Standard  Standard
> >                                           (Time secs)  Error     Deviation
> > Hikey 960 - no thermal pressure applied   138.67       6.52      11.52%
> > Hikey 960 - thermal pressure applied      122.37       5.78      11.57%
>
> Wow, +13% speedup, impressive! We definitely want this outcome.
>
> I'm wondering what happens if we do not track and decay the thermal
> load at all at the PELT level, but instantaneously decrease/increase
> effective CPU capacity in reaction to thermal events we receive from
> the CPU.

+1, it's not that obvious (to me at least) that averaging the thermal
pressure over time is necessarily what we want. Say the thermal governor
caps a CPU and 'removes' 70% of its capacity, it will take forever for
the PELT signal to ramp-up to that level before the scheduler can react.
And the other way around, if you release the cap, it'll take a while
before we actually start using the newly available capacity. I can also
imagine how reacting too fast can be counter-productive, but I guess
having numbers and/or use-cases to show that would be great :-)

Thara, have you tried to experiment with a simpler implementation as
suggested by Ingo?

Also, assuming that we do want to average things, do we actually want to
tie the thermal ramp-up time to the PELT half life? That provides nice
maths properties wrt the other signals, but it's not obvious to me that
this thermal 'constant' should be the same on all platforms. Or maybe it
should?

Thanks,
Quentin

> You describe the averaging as:
>
> > Instantaneous thermal pressure is hard to record and can sometime be
> > erroneous as there can be mismatch between the actual capping of
> > capacity and scheduler recording it.
>
> Not sure I follow the argument here: are there bogus thermal throttling
> events? If so then they are hopefully not frequent enough and should
> average out over time even if we follow it instantly.
>
> I.e. what is 'can sometimes be erroneous', exactly?
>
> Thanks,
>
>       Ingo
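
To put rough numbers on the ramp-up concern above, here is a minimal
Python sketch (not kernel code). It assumes the thermal pressure signal
behaves like a plain geometric average with the PELT half-life of 32
periods of roughly 1 ms each, and that the cap is a single step removing
70% of a CPU on the 1024 capacity scale. The function names
averaged_pressure() and periods_to_reach() are made up for illustration,
and the model ignores partial periods and how the patch series actually
feeds the signal.

```python
import math

PELT_HALFLIFE_PERIODS = 32     # PELT halves a past contribution every 32 periods
SCHED_CAPACITY_SCALE = 1024    # capacity of the biggest CPU at max frequency


def averaged_pressure(target, periods, start=0.0):
    """Signal after `periods` whole periods of a constant `target` input.

    Simplified model (assumption): pure geometric decay, no partial periods.
    """
    decay = 0.5 ** (periods / PELT_HALFLIFE_PERIODS)
    return start * decay + target * (1.0 - decay)


def periods_to_reach(fraction):
    """Whole periods needed for the signal to cover `fraction` of a step."""
    return math.ceil(-PELT_HALFLIFE_PERIODS * math.log2(1.0 - fraction))


# Hypothetical cap: the thermal governor removes 70% of the CPU's capacity.
cap = 0.70 * SCHED_CAPACITY_SCALE

for periods in (1, 8, 32, 64, 128, 256):
    # Capacity the scheduler would see with an averaged pressure signal...
    averaged = SCHED_CAPACITY_SCALE - averaged_pressure(cap, periods)
    # ...versus applying the cap as soon as the thermal event is seen.
    instantaneous = SCHED_CAPACITY_SCALE - cap
    print(f"{periods:4d} periods: averaged capacity {averaged:7.1f}, "
          f"instantaneous capacity {instantaneous:7.1f}")
```

Under these assumptions the averaged signal reflects half of the cap
after about 32 ms and 95% of it after about 139 ms, and releasing the cap
decays at the same rate (averaged_pressure(0.0, n, start=cap)). The
instantaneous variant would expose the full cap at the first update.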