On Fri, Sep 21, 2018 at 09:13:08AM +0300, Alexey Budankov wrote:

SNIP

> Events:
> cpu/period=P,event=0x3c/Duk;CPU_CLK_UNHALTED.THREAD
> cpu/period=P,umask=0x3/Duk;CPU_CLK_UNHALTED.REF_TSC
> cpu/period=P,event=0xc0/Duk;INST_RETIRED.ANY
> cpu/period=0xaae61,event=0xc2,umask=0x10/uk;UOPS_RETIRED.ALL
> cpu/period=0x11171,event=0xc2,umask=0x20/uk;UOPS_RETIRED.SCALAR_SIMD
> cpu/period=0x11171,event=0xc2,umask=0x40/uk;UOPS_RETIRED.PACKED_SIMD
>
> =================================================
>
> Command:
> /usr/bin/time /tmp/vtune_amplifier_2019.574715/bin64/perf.thr record
> --threads=T \
> -a -N -B -T -R --call-graph dwarf,1024 --user-regs=ip,bp,sp \
> -e cpu/period=P,event=0x3c/Duk,\
> cpu/period=P,umask=0x3/Duk,\
> cpu/period=P,event=0xc0/Duk,\
> cpu/period=0x30d40,event=0xc2,umask=0x10/uk,\
> cpu/period=0x4e20,event=0xc2,umask=0x20/uk,\
> cpu/period=0x4e20,event=0xc2,umask=0x40/uk \
> --clockid=monotonic_raw -- ./matrix.(icc|gcc)

hum, so I guess the results suck because of the -a option,
getting extra samples for all the perf record threads

could you try without the -a? you monitor only user events,
so you're interested only in ./matrix.* samples, right?

thanks,
jirka