On Tue, Jul 15, 2014 at 04:58:56PM +0800, Yan, Zheng wrote:
> PEBS always had the capability to log samples to its buffers without
> an interrupt. Traditionally perf has not used this but always set the
> PEBS threshold to one.
>
> For frequently occurring events (like cycles or branches or load/stores)
> this in turn requires using a relatively high sampling period to avoid
> overloading the system, by only processing PMIs. This in turn increases
> sampling error.
>
> For the common cases we still need to use the PMI because the PEBS
> hardware has various limitations. The biggest one is that it can not
> supply a callgraph. It also requires setting a fixed period, as the
> hardware does not support adaptive period. Another issue is that it
> cannot supply a time stamp and some other options. To supply a TID it
> requires flushing on context switch. It can however supply the IP, the
> load/store address, TSX information, registers, and some other things.
>
> So we can make PEBS work for some specific cases, basically as long as
> you can do without a callgraph and can set the period you can use this
> new PEBS mode.
>
> The main benefit is the ability to support much lower sampling period
> (down to -c 1000) without extensive overhead.
>
> One use case is for example to increase the resolution of the c2c tool.
> Another is double checking when you suspect the standard sampling has
> too much sampling error.
>
> Some numbers on the overhead, using cycle soak, comparing
> "perf record --no-time -e cycles:p -c" to "perf record -e cycles:p -c"
>
> period    plain  multi  delta
> 10003     15     5      10
> 20003     15.7   4      11.7
> 40003     8.7    2.5    6.2
> 80003     4.1    1.4    2.7
> 100003    3.6    1.2    2.4
> 800003    4.4    1.4    3
> 1000003   0.6    0.4    0.2
> 2000003   0.4    0.3    0.1
> 4000003   0.3    0.2    0.1
> 10000003  0.3    0.2    0.1
>
> The interesting part is the delta between multi-pebs and normal pebs. Above
> -c 1000003 it does not really matter because the basic overhead is so low.
> With periods below 80003 it becomes interesting.
>
> Note in some other workloads (e.g. kernbench) the smaller sampling periods
> cause much more overhead without multi-pebs, up to 80% (and throttling) have
> been observed with -c 10003. multi pebs generally does not throttle.
>
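
For reference, the delta column in the quoted table is just plain minus multi. The sketch below recomputes it and adds the saving as a fraction of the plain overhead, which makes it easier to see where the gain stops mattering. The figures are copied from the table as quoted; the names ROWS and overhead_saved are made up for this illustration and are not part of perf or the patch, and the unit of the overhead figures is not stated in the quoted text.

```python
# Overhead figures copied from the quoted table: (period, plain, multi).
# The unit is whatever the quoted measurement used; it is not stated there.
ROWS = [
    (10003, 15.0, 5.0),
    (20003, 15.7, 4.0),
    (40003, 8.7, 2.5),
    (80003, 4.1, 1.4),
    (100003, 3.6, 1.2),
    (800003, 4.4, 1.4),
    (1000003, 0.6, 0.4),
    (2000003, 0.4, 0.3),
    (4000003, 0.3, 0.2),
    (10000003, 0.3, 0.2),
]


def overhead_saved(rows):
    """Return (period, delta, relative) for each row.

    delta    = plain - multi, rounded to one decimal like the table
    relative = delta / plain, the saving as a fraction of plain overhead
    """
    result = []
    for period, plain, multi in rows:
        delta = round(plain - multi, 1)
        result.append((period, delta, delta / plain))
    return result


if __name__ == "__main__":
    print(f"{'period':>9} {'delta':>6} {'saved':>6}")
    for period, delta, relative in overhead_saved(ROWS):
        print(f"{period:>9} {delta:>6.1f} {relative:>6.0%}")
```

The relative column is only a convenience for reading the table; it adds no new measurement.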
And not a single word on the multiplex horror we talked about. That should be mentioned, in detail.