On Thu, Aug 14, 2014 at 07:47:56PM +0200, Stephane Eranian wrote:
> [+perf tool maintainers]
>
> On Thu, Aug 14, 2014 at 4:30 PM, Andi Kleen <a...@linux.intel.com> wrote:
> >
> > I understand all your points, but there's no alternative.
> > The only other way would be to disable INST_RETIRED.ALL.
> >
> You cannot do that either. INST_RETIRED:ALL is important. I assume
> the bug applies whether or not the event is used with a filter.
>
> I think we need to ensure that by looking at the perf.data file, one
> can reconstruct the total number of inst_Retired:all occurrences for
> the run. With a fixed period, one would do num_samples * fixed_period.
> I know the Gooda tool does that. It is used to estimate the number of
> events captured vs. the number of events occurring.

OK, I think we can make that work, iff we guarantee perf_event_attr::sample_period >= 128.

Suppose we start out with sample_period=192; then we set period_left to 192 and end up with left = 128 (we truncate the lower bits). We get an interrupt at n=128, find that period_left = 64 (>0, so we return 0 and don't run the overflow handler), and round that up to 128. Then we trigger again, at n=256, and find period_left = -64 (<=0, so we return 1 and do get an overflow). We increment by sample_period, so we get left = 128. We fire again at n=384, with period_left = 0 (<=0, so we return 1 and get an overflow). And on and on.

So while the individual interrupts are 'wrong', we get them with interval=256,128 in exactly the right ratio to average out at 192. And this works for everything >= 128.

So the num_samples*fixed_period thing is still entirely correct +- 127, which is good enough I'd say, as you already have that error anyhow.

So there's no need to 'fix' the tools; all we need to do is refuse to create INST_RETIRED:ALL events with sample_period < 128.
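
To sanity-check the arithmetic above, here is a rough user-space model of the scheme. It is only a sketch of the walkthrough, not the kernel code: the names MIN_PERIOD, program_counter() and simulate() are made up for illustration, and it assumes the programmed count simply has its low 7 bits dropped, with a floor of 128.

```python
MIN_PERIOD = 128  # assumption: programmed counts are multiples of 128, minimum 128


def program_counter(period_left):
    """Count actually programmed: low 7 bits truncated, rounded up to at least 128."""
    left = period_left & ~(MIN_PERIOD - 1)
    return max(left, MIN_PERIOD)


def simulate(sample_period, total_events):
    """Return the event counts n at which an overflow (a sample) is reported."""
    if sample_period < MIN_PERIOD:
        # Mirrors the proposal: refuse periods below 128 outright.
        raise ValueError("sample_period must be >= 128")

    period_left = sample_period
    n = 0
    overflows = []
    while True:
        left = program_counter(period_left)
        if n + left > total_events:
            break
        n += left               # the counter fires after 'left' events
        period_left -= left
        if period_left <= 0:    # period used up: report an overflow
            overflows.append(n)
            period_left += sample_period
        # otherwise: interrupt without a sample, just reprogram
    return overflows


if __name__ == "__main__":
    print(simulate(192, 1024))
```

For sample_period=192 and 1024 events this model reports overflows at n = 256, 384, 640, 768 and 1024, i.e. intervals alternating between 256 and 128, and at each overflow num_samples*sample_period is within 127 of n.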