On 03 Jan 2007, at 12:05, Stephane Eranian wrote:

Did you try other events? Also, PIN only counts user-level instructions. Did you make sure your measurements were set up to count only at user level?


I haven't tried other events yet, because those are a lot harder to check. I've set the counters to count only user-level instructions (setting it to kernel yields a much higher count).

I've been experimenting with this, and it seems the P4 counters just aren't reporting the correct number of instructions. Some instructions seem to be counted double, or even more.

The highest difference for the SPEC CPU2000 benchmarks is observed for mesa:

PIN: 291,680,398,079
perfex @ Pentium4: 298,575,250,612

That's a difference of about 6.9 * 10^9 instructions, or 2.36% of the total execution (relative to the PIN count), which is _huge_. My needs, as I've explained before, only allow a difference of 100,000 instructions, so this is a real showstopper for me. Clearly, perfmon is not to blame here, nor is perfctr. The way I see it, something is just wrong with the hardware implementation...
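For anyone who wants to redo the arithmetic, here is a minimal sketch in Python. The two counts and the 100,000-instruction tolerance are the ones quoted above; the function and variable names are just illustrative, not part of any tool mentioned here.

```python
# Counts quoted in this message for the SPEC CPU2000 mesa benchmark.
pin_count = 291_680_398_079      # PIN (user-level instructions)
perfex_count = 298_575_250_612   # perfex on the Pentium 4

TOLERANCE = 100_000  # maximum acceptable difference stated in this message


def compare_counts(reference, measured):
    """Return (absolute difference, difference as a percentage of reference)."""
    diff = measured - reference
    return diff, 100.0 * diff / reference


diff, pct = compare_counts(pin_count, perfex_count)
within_tolerance = abs(diff) <= TOLERANCE

print(f"difference: {diff:,} instructions ({pct:.2f}% of the PIN count)")
print(f"within tolerance of {TOLERANCE:,}: {within_tolerance}")
```

The percentage is taken relative to the PIN count, since PIN is treated as the reference here; dividing by the perfex count instead would give a slightly smaller figure.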

As a note for future users of instr_retired on Intel Pentium 4 with any tool (perfctr, perfex, PAPI, perfmon, ...): be very careful with the results you're getting, because it appears that some instructions increment the counters more than once, which leads to misleading results. The settings I'm using (with perfex) are:

perfex -e 0x00039000/[EMAIL PROTECTED] <benchmark>

for instr_completed (which yields a zero count for me):

perfex -e 0x00039000/[EMAIL PROTECTED] <benchmark>

Any additional comments on this are welcome, but I won't be losing any more time over this. The counts on AMD machines are looking a lot better, so I'll just go with AMD.

Also, have you tried system-wide mode, just to verify that there is nothing wrong with the PMU context switch code?


Yep, that yields higher counts. Kernel-only yields the difference between system-wide and user-level.

As for instr_completed, I have never been able to measure it correctly. There may be unpublished constraints on this event which libpfm does not know about.

I've tried using perfex with the correct settings, but I get a zero count every time I try (on two different machines). No idea what's causing this. There are some undocumented issues for sure...

greetings,

Kenneth

--

Statistics are like a bikini. What they reveal is suggestive, but what they conceal is vital (Aaron Levenstein)

Kenneth Hoste
ELIS - Ghent University
[EMAIL PROTECTED]
http://www.elis.ugent.be/~kehoste


_______________________________________________
perfmon mailing list
[email protected]
http://www.hpl.hp.com/hosted/linux/mail-archives/perfmon/