From: Kan Liang <kan.li...@linux.intel.com>

------
Changes since V2:
- Refined the changelog
- Introduced a specific read function for large PEBS. The previous
  generic PEBS read function was confusing. Disabled the PMU in the
  pmu::read() path for large PEBS. Handled the corner case when
  reload_times == 0.
- Modified the parameters of intel_pmu_save_and_restart_reload().
  Discarded local64_cmpxchg.
- Added Fixes tag
- Added WARN to handle reload_times == 0 || reload_val == 0

Changes since V1:
- Check PERF_X86_EVENT_AUTO_RELOAD before calling
  intel_pmu_save_and_restart()
- Introduce a special-purpose intel_pmu_save_and_restart() just for
  AUTO_RELOAD
- New patch to disable userspace RDPMC usage if large PEBS is enabled

------

There is a bug when reading event->count with large PEBS enabled.
Here is an example:

 # ./read_count
 0x71f0
 0x122c0
 0x1000000001c54
 0x100000001257d
 0x200000000bdc5

The bug is caused by two issues:
- In x86_perf_event_update(), the calculation of event->count does not
  take the auto-reload values into account.
- In x86_pmu_read(), the undrained values in the large PEBS buffer are
  not counted.

The first issue was introduced with the auto-reload mechanism enabled
since commit 851559e35fd5 ("perf/x86/intel: Use the PEBS auto reload
mechanism when possible").

Patch 1 fixes the issue in x86_perf_event_update().

The second issue was introduced since commit b8241d20699e
("perf/x86/intel: Implement batched PEBS interrupt handling (large
PEBS interrupt threshold)").

Patches 2-4 fix the issue in x86_pmu_read().

Besides these two issues, userspace RDPMC usage is broken for large
PEBS as well. The RDPMC issue was also introduced by commit
b8241d20699e ("perf/x86/intel: Implement batched PEBS interrupt
handling (large PEBS interrupt threshold)").

Patch 5 fixes the RDPMC issue.

The source code of read_count is as below.
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <linux/perf_event.h>

typedef unsigned long long u64;

static long pagesize;

struct cpu {
	int fd;
	struct perf_event_mmap_page *buf;
};

int perf_open(struct cpu *ctx, int cpu)
{
	struct perf_event_attr attr = {
		.type = PERF_TYPE_HARDWARE,
		.size = sizeof(struct perf_event_attr),
		.sample_period = 100000,
		.config = 0,
		.sample_type = PERF_SAMPLE_IP | PERF_SAMPLE_TID |
			       PERF_SAMPLE_TIME | PERF_SAMPLE_CPU,
		.precise_ip = 3,
		.mmap = 1,
		.comm = 1,
		.task = 1,
		.mmap2 = 1,
		.sample_id_all = 1,
		.comm_exec = 1,
	};

	ctx->buf = NULL;
	ctx->fd = syscall(__NR_perf_event_open, &attr, -1, cpu, -1, 0);
	if (ctx->fd < 0) {
		perror("perf_event_open");
		return -1;
	}
	return 0;
}

void perf_close(struct cpu *ctx)
{
	close(ctx->fd);
	if (ctx->buf)
		munmap(ctx->buf, pagesize);
}

int main(int ac, char **av)
{
	struct cpu ctx;
	u64 count;

	pagesize = sysconf(_SC_PAGESIZE);
	perf_open(&ctx, 0);

	while (1) {
		sleep(5);
		if (read(ctx.fd, &count, 8) != 8) {
			perror("counter read");
			break;
		}
		printf("0x%llx\n", count);
	}
	perf_close(&ctx);
	return 0;
}

Kan Liang (5):
  perf/x86/intel: fix event update for auto-reload
  perf/x86: introduce read function for x86_pmu
  perf/x86/intel/ds: introduce read function for large pebs
  perf/x86/intel: introduce read function for intel_pmu
  perf/x86: fix: disable userspace RDPMC usage for large PEBS

 arch/x86/events/core.c       |  5 ++-
 arch/x86/events/intel/core.c |  9 +++++
 arch/x86/events/intel/ds.c   | 85 ++++++++++++++++++++++++++++++++++++++++++--
 arch/x86/events/perf_event.h |  3 ++
 4 files changed, 99 insertions(+), 3 deletions(-)

-- 
2.7.4