On Sat, Feb 21, 2015 at 07:39:52PM +0100, Ingo Molnar wrote:
> So the workload improved by ~600,000 usecs, and there's 
> 68,000 less calls, so it saved 8.8 usecs per call. Isn't 

I think you mean more calls. The eager measurement has more calls. Let
me do some primitive math:

def   = (234.681331200 / 712000) * 10^6 = 329.608611 microsecs/call
eager = (234.066525648 / 780000) * 10^6 = 300.085289 microsecs/call

The diff is a speedup of 29.523322 microsecs per call, which could be
explained by the cost of the CR0.TS serialization semantics in lazy mode.
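
For anyone who wants to redo the math, here is a minimal sketch in Python.
It assumes "def" is the default (lazy) run and uses the rounded call counts
from above; the helper name usecs_per_call() is made up for illustration.

```python
# Elapsed seconds and (rounded) call counts from the two runs quoted above.
# Assumption: "def" is the default, i.e. lazy FPU, run.

def usecs_per_call(elapsed_secs, calls):
    """Average cost of one call, in microseconds."""
    return elapsed_secs / calls * 1e6

default_us = usecs_per_call(234.681331200, 712_000)
eager_us = usecs_per_call(234.066525648, 780_000)
diff_us = default_us - eager_us

print(f"def  : {default_us:.3f} microsecs/call")
print(f"eager: {eager_us:.3f} microsecs/call")
print(f"diff : {diff_us:.3f} microsecs/call")
```

This only divides total elapsed time by the call count, so anything else
that ran during the measurement ends up in the per-call number too.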

> that a bit too high?

Now, is 29 microseconds too high? I'm not sure this number is even correct
and not just noise interfering.

> I'd sleep a lot better if we had some runtime debug flag to 
> be able to do run-to-run comparisons on the same booted up 
> kernel, or so.

Let me take a look at whether we could do some knob... The nice thing is,
the code uses use_eager_fpu() to check stuff, so we should be able to clear
the cpufeature flag.

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
