On Mon, Feb 22, 2016 at 11:52 AM, Peter Zijlstra <[email protected]> wrote:
> On Fri, Feb 19, 2016 at 09:28:23AM -0800, Steve Muckle wrote:
>> On 02/19/2016 08:42 AM, Srinivas Pandruvada wrote:
>> > We did experiments using util/max in intel_pstate. For some benchmarks
>> > there were regression of 4 to 5%, for some benchmarks it performed at
>> > par with getting utilization from the processor. Further optimization
>> > in the algorithm is possible and still in progress. Idea is that we can
>> > change P-State fast enough and be more reactive. Once I have good data,
>> > I will send to this list. The algorithm can be part of the cpufreq
>> > governor too.
>>
>> There has been a lot of work in the area of scheduler-driven CPU
>> frequency selection by Linaro and ARM as well. It was posted most
>> recently a couple months ago:
>>
>> http://thread.gmane.org/gmane.linux.power-management.general/69176
>>
>> It was also posted as part of the energy-aware scheduling series last
>> July. There's a new RFC series forthcoming which I had hoped (and
>> failed) to post prior to my business travel this week; it should be out
>> next week. It will address the feedback received thus far along with
>> locking and other things.
>
> Right, so I had a wee look at that again, and had a quick chat with Juri
> on IRC. So the main difference seems to be that you guys want to know
> why the utilization changed, as opposed to purely _that_ it changed.
>
> And hence you have callbacks all over the place.
>
> I'm not too sure I really like that too much, it bloats the code and
> somewhat obfuscates the point.
>
> So I would really like there to be just the one callback when we
> actually compute a new number, and that is update_load_avg().
>
> Now I think we can 'easily' propagate the information you want into
> update_load_avg() (see below), but I would like to see actual arguments
> for why you would need this.
>
> For one, the migration bits don't really make sense. We typically do not
> call migration code local on both cpus, typically just one, but possibly
> neither. That means you cannot actually update the relevant CPU state
> from these sites anyway.
>
>> The scheduler hooks for utilization-based cpufreq operation deserve a
>> lot more debate I think. They could quite possibly have different
>> requirements than hooks which are chosen just to guarantee periodic
>> callbacks into sampling-based governors.
>
> I'll repeat what Rafael said, the periodic callback nature is a
> 'temporary' hack, simply because current cpufreq depends on that.
>
> The idea is to wane cpufreq off of that requirement and then drop that
> part.

Right and I can see at least a couple of ways to do that, but it'll
depend on where the final hooks will be located and what arguments they
will pass.

Thanks,
Rafael

