> Well, at least the front-end side is still documented in the SDM as being
> usable to count stalled cycles.
Stalled frontend cycles do not necessarily mean frontend bound. The real
bottleneck can still be somewhere later in the pipeline. Out-of-order CPUs
are complex.

> AFAICS backend stall cycles are documented to work on Ivy Bridge.

I'm not aware of any documentation that presents these events as accurate
frontend/backend stalls without using the full TopDown methodology
(Optimization manual B.3.2).

The level 1 top-down method for Ivy Bridge and Haswell is:

PipelineWidth = 4
Slots = PipelineWidth * CPU_CLK_UNHALTED
FrontendBound = IDQ_UOPS_NOT_DELIVERED.CORE / Slots
BadSpeculation = (UOPS_ISSUED.ANY - UOPS_RETIRED.RETIRE_SLOTS + PipelineWidth * INT_MISC.RECOVERY_CYCLES) / Slots
Retiring = UOPS_RETIRED.RETIRE_SLOTS / Slots
BackendBound = 1 - (FrontendBound + BadSpeculation + Retiring)

> For perf stat -a alike system-wide workloads it should still produce
> usable results that way.

For some classes of workloads it will produce a large, unpredictable
systematic error.

> I.e. something like the patch below (it does not solve the double counting
> yet).

Well, you can add it, but I'm not going to Ack it.

-Andi
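
P.S. For anyone who wants to check the level 1 arithmetic by hand, here is a
rough Python sketch. The function name, argument names and sample counts are
made up for illustration and are not from any real measurement or tool. The
backend-bound share is taken as the remainder, 1 - (frontend + bad speculation
+ retiring), so the four categories add up to all issue slots. Getting the raw
counts out of perf is left out here.

```python
PIPELINE_WIDTH = 4  # issue slots per cycle, as given for Ivy Bridge / Haswell


def topdown_level1(clk_unhalted, uops_not_delivered, uops_issued,
                   retire_slots, recovery_cycles, width=PIPELINE_WIDTH):
    """Return the four level 1 TopDown fractions from raw event counts.

    Arguments map to the events named in the mail (hypothetical names):
      clk_unhalted        CPU_CLK_UNHALTED
      uops_not_delivered  IDQ_UOPS_NOT_DELIVERED.CORE
      uops_issued         UOPS_ISSUED.ANY
      retire_slots        UOPS_RETIRED.RETIRE_SLOTS
      recovery_cycles     INT_MISC.RECOVERY_CYCLES
    """
    slots = width * clk_unhalted
    if slots <= 0:
        raise ValueError("cycle count must be positive")

    frontend = uops_not_delivered / slots
    bad_spec = (uops_issued - retire_slots + width * recovery_cycles) / slots
    retiring = retire_slots / slots
    # Whatever is not accounted for above is attributed to the backend.
    backend = 1.0 - (frontend + bad_spec + retiring)

    return {
        "FrontendBound": frontend,
        "BadSpeculation": bad_spec,
        "Retiring": retiring,
        "BackendBound": backend,
    }


if __name__ == "__main__":
    # Invented counts, only to exercise the formulas.
    result = topdown_level1(
        clk_unhalted=1_000_000,
        uops_not_delivered=800_000,
        uops_issued=2_200_000,
        retire_slots=2_000_000,
        recovery_cycles=50_000,
    )
    for name, share in result.items():
        print(f"{name:15s} {share:6.1%}")
```

All five counts have to come from the same measurement interval on the same
CPU; mixing intervals makes the shares meaningless.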