On Tue, Feb 02, 2016 at 12:08:46AM +0100, Rasmus Villemoes wrote:
> On Mon, Feb 01 2016, Andi Kleen <a...@linux.intel.com> wrote:
>
> > On Mon, Feb 01, 2016 at 10:25:17PM +0100, Rasmus Villemoes wrote:
> >> On Thu, Jan 28 2016, Andi Kleen <a...@firstfloor.org> wrote:
> >>
> >> > From: Andi Kleen <a...@linux.intel.com>
> >> >
> >> > The menu cpuidle governor does at least two int_sqrt() each time
> >> > we go into idle in get_typical_interval to compute stddev
> >> >
> >> > int_sqrts take 100-120 cycles each. Short idle latency is important
> >> > for many workloads.
> >> >
> >>
> >> If you want to optimize get_typical_interval(), why not just take the
> >> square root out of the equation (literally)?
> >>
> >> Something like
> >
> > Looks good. Yes that's a better fix.
> >
>
> Thanks. (Is there a good way to tell gcc that avg*avg is actually a
> 32x32->64 multiplication?)

I don't think there is, but you could define a custom macro with a
fallback on pure 64x64->64.

-Andi
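
For illustration only, a minimal sketch of what such a macro with a
generic fallback might look like. The names MUL_U32_U32 and square_u32
are hypothetical and do not come from this thread or from the patch.
The assumption is that an architecture header could define MUL_U32_U32
first (for example with inline asm for a widening multiply); otherwise
the plain C fallback is used, which is a 64x64->64 multiply whose
operands are visibly zero-extended 32-bit values.

```c
#include <stdint.h>

/*
 * Hypothetical helper: multiply two 32-bit values into a 64-bit result.
 * An architecture header could provide its own MUL_U32_U32 before this
 * point; if it does not, fall back to a plain 64x64->64 multiply.
 */
#ifndef MUL_U32_U32
#define MUL_U32_U32(a, b) ((uint64_t)(uint32_t)(a) * (uint64_t)(uint32_t)(b))
#endif

/* Example use: square a 32-bit average without truncating to 32 bits. */
static inline uint64_t square_u32(uint32_t avg)
{
	return MUL_U32_U32(avg, avg);
}
```

Whether the compiler actually emits a single 32x32->64 instruction for
the fallback depends on the target and the optimization level, so the
generated code would have to be checked.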