On 02/15/2015 11:48 PM, Ingo Molnar wrote:
Linus,

Please pull the latest perf-core-for-linus git tree from:

   git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git perf-core-for-linus

   # HEAD: a66734297f78707ce39d756b656bfae861d53f62 perf/x86: Add /sys/devices/cpu/rdpmc=2 to allow rdpmc for all tasks
[...]
The extra CR4 manipulation adds ~ <50ns to the context switch cost between rdpmc-capable and rdpmc-non-capable mms.
That's about the best I could benchmark, too -- if it was more than about 50ns, I'm pretty sure I would have seen a difference, but, as it stands, it seems to have been lost in the noise. Maybe I should find a better benchmark.
In any event, this series is probably a mixed bag performance-wise. In the best case, there's a small extra cost in context switches, and, when switching PCE, there's a CR4 write. On SVM guests, the CR4 write will suck.
To balance that out, I removed a CR4 read from VMX entry and from global TLB flushes. The former mostly fixes a performance regression from a security fix a few releases back, and I expect that the latter will more than offset the added context switch overhead (especially on SVM guests, where even CR4 reads exit AFAIK).
Anyway, I tried and failed to detect any difference at all. Context switch timing was very noisy for me.
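For anyone who wants to try reproducing this, here is a minimal sketch of the kind of measurement involved. It is not the benchmark used above. It assumes a plain pipe ping-pong between a parent and a forked child, both pinned (best effort) to CPU 0 so that each round trip forces two context switches, and it reports the median rather than the mean to blunt the noise. The name pingpong_median_ns is made up for illustration, and the sketch does not by itself make either task's mm rdpmc-capable, so it only measures the baseline switch cost.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Monotonic clock in nanoseconds. */
static uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

static int cmp_u64(const void *a, const void *b)
{
	uint64_t x = *(const uint64_t *)a, y = *(const uint64_t *)b;

	return (x > y) - (x < y);
}

/*
 * Hypothetical helper: bounce one byte between parent and child over two
 * pipes and return the median round-trip time in ns, or 0 on error.
 * Note that it changes the caller's CPU affinity as a side effect.
 */
uint64_t pingpong_median_ns(int rounds)
{
	int to_child[2], to_parent[2], i;
	uint64_t *samples, median;
	cpu_set_t set;
	char c = 'x';
	pid_t pid;

	if (rounds <= 0)
		return 0;
	samples = malloc((size_t)rounds * sizeof(*samples));
	if (!samples)
		return 0;
	if (pipe(to_child)) {
		free(samples);
		return 0;
	}
	if (pipe(to_parent)) {
		close(to_child[0]);
		close(to_child[1]);
		free(samples);
		return 0;
	}

	/* Best effort: share one CPU so every wakeup is a real switch. */
	CPU_ZERO(&set);
	CPU_SET(0, &set);
	(void)sched_setaffinity(0, sizeof(set), &set);

	pid = fork();
	if (pid < 0) {
		close(to_child[0]);
		close(to_child[1]);
		close(to_parent[0]);
		close(to_parent[1]);
		free(samples);
		return 0;
	}
	if (pid == 0) {
		/* Child: echo every byte back until the parent closes the pipe. */
		close(to_child[1]);
		close(to_parent[0]);
		while (read(to_child[0], &c, 1) == 1)
			if (write(to_parent[1], &c, 1) != 1)
				break;
		_exit(0);
	}

	close(to_child[0]);
	close(to_parent[1]);
	for (i = 0; i < rounds; i++) {
		uint64_t t0 = now_ns();

		if (write(to_child[1], &c, 1) != 1 ||
		    read(to_parent[0], &c, 1) != 1)
			break;
		samples[i] = now_ns() - t0;
	}
	close(to_child[1]);	/* EOF terminates the child's loop */
	close(to_parent[0]);
	waitpid(pid, NULL, 0);

	if (i == 0) {
		free(samples);
		return 0;
	}
	qsort(samples, (size_t)i, sizeof(*samples), cmp_u64);
	median = samples[i / 2];
	free(samples);
	return median;
}
```

Calling pingpong_median_ns(100000) a few times on an otherwise idle machine and comparing the medians between kernels is the idea. A round trip covers two switches plus pipe overhead, so a ~50ns delta per switch is a small fraction of the total and may still be hard to see.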
--Andy

