On Fri, Jan 29, 2016 at 09:35:22AM -0800, Andy Lutomirski wrote:
> I'll fiddle with that benchmark a little bit. Maybe I can make it
> suck less. If anyone knows a good non-micro benchmark for this, let
> me know.

Yeah, I don't know of a good one. The TLB and all those intermediary
walker caches modern x86 CPUs have are really good. So it is hard to
measure any improvements there.

I guess in this particular case, if one can't measure slowdowns and the
code is simple enough, then we're good enough. In theory, we are
carefully killing fewer TLB entries and less of the related cached page
walker data, so that should be a good thing...

> I refuse to use dbus as my benchmark :)

Ha!

> FWIW, I benchmarked cr4 vs invpcid by adding a prctl and calling it in
> a loop. Apparently INVPCID is faster than the two CR4 writes.

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.