From: Kan Liang <kan.li...@linux.intel.com>

------

Changes since V2:
 - Refined the changelog
 - Introduced specific read function for large PEBS.
   The previous generic PEBS read function is confusing.
   Disabled PMU in pmu::read() path for large PEBS.
   Handled the corner case when reload_times == 0.
 - Modified the parameter of intel_pmu_save_and_restart_reload()
   Discarded local64_cmpxchg
 - Added fixes tag
 - Added WARN to handle reload_times == 0 || reload_val == 0

Changes since V1:
 - Check PERF_X86_EVENT_AUTO_RELOAD before calling
   intel_pmu_save_and_restart()
 - Introduce a special purpose intel_pmu_save_and_restart()
   just for AUTO_RELOAD.
 - New patch to disable userspace RDPMC usage if large PEBS is enabled.

------

There is a bug when reading event->count with large PEBS enabled:
the value returned is wrong.
Here is an example.
 #./read_count
 0x71f0
 0x122c0
 0x1000000001c54
 0x100000001257d
 0x200000000bdc5

The bug is caused by two issues.
- In x86_perf_event_update, the calculation of event->count does not
  take the auto-reload values into account.
- In x86_pmu_read, it does not count the values still undrained in the
  large PEBS buffer.

The first issue was introduced by commit 851559e35fd5
("perf/x86/intel: Use the PEBS auto reload mechanism when possible"),
which enabled the auto-reload mechanism.

Patch 1 fixes the issue in x86_perf_event_update.

The second issue was introduced by commit b8241d20699e
("perf/x86/intel: Implement batched PEBS interrupt handling (large PEBS
interrupt threshold)").

Patches 2-4 fix the issue in x86_pmu_read.

Besides these two issues, userspace RDPMC usage is broken for large
PEBS as well.
The RDPMC issue was also introduced by commit b8241d20699e
("perf/x86/intel: Implement batched PEBS interrupt handling (large PEBS
interrupt threshold)").

Patch 5 fixes the RDPMC issue.

The source code of read_count is as below.

#include <stdio.h>
#include <stdint.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <linux/perf_event.h>

typedef uint64_t u64;

static long pagesize;

struct cpu {
        int fd;
        struct perf_event_mmap_page *buf;
};

int perf_open(struct cpu *ctx, int cpu)
{
        struct perf_event_attr attr = {
                .type = PERF_TYPE_HARDWARE,
                .size = sizeof(struct perf_event_attr),
                .sample_period = 100000,
                .config = 0,
                .sample_type = PERF_SAMPLE_IP | PERF_SAMPLE_TID |
                                PERF_SAMPLE_TIME | PERF_SAMPLE_CPU,
                .precise_ip = 3,
                .mmap = 1,
                .comm = 1,
                .task = 1,
                .mmap2 = 1,
                .sample_id_all = 1,
                .comm_exec = 1,
        };
        ctx->buf = NULL;
        ctx->fd = syscall(__NR_perf_event_open, &attr, -1, cpu, -1, 0);
        if (ctx->fd < 0) {
                perror("perf_event_open");
                return -1;
        }
        return 0;
}

void perf_close(struct cpu *ctx)
{
        close(ctx->fd);
        if (ctx->buf)
                munmap(ctx->buf, pagesize);
}

int main(int ac, char **av)
{
        struct cpu ctx;
        u64 count;

        pagesize = sysconf(_SC_PAGESIZE);
        if (perf_open(&ctx, 0) < 0)
                return 1;

        while (1) {
                sleep(5);

                if (read(ctx.fd, &count, 8) != 8) {
                        perror("counter read");
                        break;
                }
                printf("0x%llx\n", (unsigned long long)count);

        }
        perf_close(&ctx);
        return 0;
}

Kan Liang (5):
  perf/x86/intel: fix event update for auto-reload
  perf/x86: introduce read function for x86_pmu
  perf/x86/intel/ds: introduce read function for large pebs
  perf/x86/intel: introduce read function for intel_pmu
  perf/x86: fix: disable userspace RDPMC usage for large PEBS

 arch/x86/events/core.c       |  5 ++-
 arch/x86/events/intel/core.c |  9 +++++
 arch/x86/events/intel/ds.c   | 85 ++++++++++++++++++++++++++++++++++++++++++--
 arch/x86/events/perf_event.h |  3 ++
 4 files changed, 99 insertions(+), 3 deletions(-)

-- 
2.7.4
