On Mon, Oct 19, 2015 at 03:58:16PM -0700, Andi Kleen wrote:
> Switch the cycles:pp alias from UOPS_RETIRED to INST_RETIRED.PREC_DIST.
> The basic mechanism of abusing the inverse cmask to get all cycles
> works the same as before.
>
> PREC_DIST has special support for avoiding shadow effects, which
> can give better results compared to UOPS_RETIRED. The drawback is
> that PREC_DIST can only schedule on counter 1, but that is ok for
> cycle sampling, as there is normally no need to do multiple cycle
> sampling runs in parallel. It is still possible to run perf top
> in parallel, as that doesn't use precise mode. Also of course
> the multiplexing can still allow parallel operation.

So the worry I have with this is that there might indeed be people
wanting to use this in parallel. Typically on workstations you do not,
because there's only a single user, but on servers it might be more
common.

The thing I expect to be most common is having both a CPU wide and a
per task cycle counter enabled.

This means a fairly visible change in behaviour depending on uarch.

And you having killed the flag bits for PEBS events precludes people
from using this manually, right? I think we want to exempt
.inv=1 .cmask=16 from that general rule on general utility value.

We could maybe abuse .precise_ip = 3 for this?

> On earlier parts there were various hardware bugs in it
> (but no show stopper on IvyBridge and up I believe),
> so it could be enabled there after sufficient testing.

Just enable it for IVB+ then.

> On Sandy Bridge PREC_DIST can only be scheduled as a single
> event on the PMU, which is too limiting. Before Sandy
> Bridge it was not supported.

Right, that was a bit cumbersome :-)
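
For reference, a minimal sketch of how the .inv=1 .cmask=16 trick discussed above maps onto a raw x86 event config. With the invert bit set, the counter ticks on every cycle in which fewer than cmask events occur; since fewer than 16 instructions (or uops) retire in any cycle, that is every cycle. The bit positions follow the architectural IA32_PERFEVTSELx layout. The event/umask codes (0xc0/0x01 for INST_RETIRED.PREC_DIST, 0xc2/0x01 for UOPS_RETIRED.ALL) are not stated in this thread and are assumed here from the Intel event tables; the helper name raw_config is made up for illustration.

```python
def raw_config(event, umask, inv=0, cmask=0):
    """Pack fields into an x86 raw event config (IA32_PERFEVTSELx layout).

    bits 0-7: event select, bits 8-15: unit mask,
    bit 23: invert counter mask, bits 24-31: counter mask.
    """
    return ((event & 0xff)
            | ((umask & 0xff) << 8)
            | ((inv & 1) << 23)
            | ((cmask & 0xff) << 24))

# Assumed event codes (not given in the thread itself).
PREC_DIST_CYCLES = raw_config(event=0xc0, umask=0x01, inv=1, cmask=16)
UOPS_RETIRED_CYCLES = raw_config(event=0xc2, umask=0x01, inv=1, cmask=16)

print(f"INST_RETIRED.PREC_DIST, inv, cmask=16: r{PREC_DIST_CYCLES:x}")
print(f"UOPS_RETIRED.ALL, inv, cmask=16:       r{UOPS_RETIRED_CYCLES:x}")
```

The two encodings differ only in the event select byte, which is why the alias can be switched without changing the inv/cmask mechanism. Whether such a raw config is accepted together with a precise modifier depends on the flag-bit check raised above.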

