Brendan Gregg writes:
>
> Despite millions of samples, many NOPs are never seen. (See the
> Percent column.) I'm not using PEBS, but I suppose I should.
Yes, you should. You get better results with :p / :pp (PEBS),
and the best results (but not at the cycles level) with
INST_RETIRED.PREC_DIST / ALL.

cycles:upp
       │    0000000000400400 <main>:
  0.05 │ 0:   nop
       │      nop
       │      nop
 19.78 │      nop
  0.17 │      nop
       │      nop
       │      nop
 20.16 │      nop
  0.12 │      nop
       │      nop
       │      nop
 20.49 │      nop
  0.07 │      nop
       │      nop
       │      nop
 18.88 │      nop
 20.30 │    ↑ jmp    0

cpu/event=0xc0,umask=0x1,name=inst_retired_prec_dist/pp
       │    0000000000400400 <main>:
  6.13 │ 0:   nop
       │      nop
  0.02 │      nop
  0.02 │      nop
 23.69 │      nop
       │      nop
       │      nop
  0.02 │      nop
 24.05 │      nop
       │      nop
       │      nop
  0.02 │      nop
 23.60 │      nop
       │      nop
       │      nop
       │      nop
 22.46 │    ↑ jmp    0
This is nearly as good as you can get here, because the machine can
retire four nops per cycle.
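
To make the four-per-cycle point concrete, here is a toy model in Python. It is only a sketch under loud assumptions, not a description of the real hardware: the 17-instruction loop retires in fixed groups of up to four starting at the loop head, the final jmp retires alone, one sample is taken per cycle, and each sample is charged to the last instruction of the group retiring in that cycle. The function names and the sample count are hypothetical.

```python
from collections import Counter

def retire_groups(n_instructions, width):
    """Split one loop iteration into retirement groups of at most `width`."""
    return [list(range(start, min(start + width, n_instructions)))
            for start in range(0, n_instructions, width)]

def sample_profile(n_instructions=17, width=4, n_samples=1000):
    """Take one sample per cycle; one group retires per cycle (assumption).

    Each sample is charged to the last instruction of the retiring group,
    which is a modelling choice, not documented hardware behaviour.
    """
    groups = retire_groups(n_instructions, width)
    hits = Counter()
    for cycle in range(n_samples):
        group = groups[cycle % len(groups)]
        hits[group[-1]] += 1
    # Convert hit counts to percentages, keyed by instruction index.
    return {idx: 100.0 * count / n_samples for idx, count in sorted(hits.items())}

if __name__ == "__main__":
    for idx, percent in sample_profile().items():
        print(f"instruction {idx:2d}: {percent:5.2f}%")
```

Under these assumptions only instructions 3, 7, 11, 15 and 16 ever receive samples, 20% each, and the other twelve nops are never seen, however many samples are taken.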
A common trick is to run PREC_DIST/ALL in parallel with other events
and correlate.
> I think Andi mentioned this to me last year -- that instruction
> profiling was no longer reliable.
It never was.
> Is this due to parallel and out-of-order execution? (ie, we're
> sampling the instruction pointer, but that's set to the resumption
> instruction, not the instructions being processed in the backend?).
Most problems are due to 'skid': it takes some time to trigger the
profiling interrupt after the event has fired. Without PEBS the skid is
quite high. With PEBS it is a lot better, because the hardware writes
the information into the PEBS buffer faster, but the skid is still not
zero and can be noticeable. With PREC_DIST/ALL the hardware does some
additional tricks to reduce it further.
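
The effect of skid on a profile can be sketched with another toy model in Python. It assumes a fixed skid measured in instructions, which is a simplification (real skid varies and is not a constant instruction count), and the loop length, event counts, and skid values below are invented for illustration only.

```python
def reported_counts(true_counts, skid):
    """Shift each instruction's event count `skid` slots later in the loop.

    `true_counts[i]` is how often instruction i really triggered the event.
    The result is where a profiler would report those events if every
    sample landed `skid` instructions late (wrapping around the loop).
    """
    n = len(true_counts)
    shifted = [0] * n
    for index, count in enumerate(true_counts):
        shifted[(index + skid) % n] += count
    return shifted

if __name__ == "__main__":
    # Hypothetical 8-instruction loop where only instruction 2 fires the event.
    true_counts = [0, 0, 100, 0, 0, 0, 0, 0]
    for label, skid in [("large skid", 4), ("small skid", 1), ("zero skid", 0)]:
        print(f"{label:10s} {reported_counts(true_counts, skid)}")
```

With a skid of 4 all of the samples are reported on instruction 6 instead of instruction 2. Reducing the skid moves the reported location closer to the instruction that actually caused the event.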
There are also other problems: for example, an event may not be tied
to an instruction. Some events have inherently large skid.
-Andi