On Sun, Aug 14, 2022 at 5:02 PM Alistair Francis <alistai...@gmail.com>
wrote:

> On Fri, Aug 12, 2022 at 12:05 PM Atish Patra <ati...@atishpatra.org>
> wrote:
> >
> > On Tue, Aug 2, 2022 at 4:33 PM Atish Patra <ati...@rivosinc.com> wrote:
> > >
> > > The latest version of the SBI specification includes a Performance
> > > Monitoring Unit (PMU) extension[1] which allows the supervisor to
> > > start/stop/configure various PMU events. The Sscofpmf ('Ss' for
> > > Privileged arch and Supervisor-level extensions, and 'cofpmf' for
> > > Count OverFlow and Privilege Mode Filtering) extension[2] allows
> > > perf-like tools to handle counter overflow interrupts and to filter
> > > counting by privilege mode.
> > >
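As a rough illustration of what Sscofpmf adds, the sketch below models a single 64-bit counter with privilege mode filtering and overflow reporting. It is a simplified model and not the QEMU code from this series: the function names and the mode strings are hypothetical, and only the bit positions (OF at bit 63, followed by the per-mode inhibit bits) are taken from the Sscofpmf layout of mhpmevent on RV64.

```python
# Illustrative model only; names are hypothetical and this is not QEMU code.
# Bit positions follow the Sscofpmf layout of mhpmevent on RV64.
MHPMEVENT_OF = 1 << 63     # overflow status, also gates the interrupt
MHPMEVENT_MINH = 1 << 62   # inhibit counting in M-mode
MHPMEVENT_SINH = 1 << 61   # inhibit counting in S-mode
MHPMEVENT_UINH = 1 << 60   # inhibit counting in U-mode
MHPMEVENT_VSINH = 1 << 59  # inhibit counting in VS-mode
MHPMEVENT_VUINH = 1 << 58  # inhibit counting in VU-mode

INHIBIT_BIT = {
    "M": MHPMEVENT_MINH,
    "S": MHPMEVENT_SINH,
    "U": MHPMEVENT_UINH,
    "VS": MHPMEVENT_VSINH,
    "VU": MHPMEVENT_VUINH,
}

COUNTER_MASK = (1 << 64) - 1


def counts_in_mode(mhpmevent, mode):
    """Return True if the counter is allowed to count in the given mode."""
    return (mhpmevent & INHIBIT_BIT[mode]) == 0


def increment(counter, mhpmevent, mode, delta=1):
    """Return (new_counter, new_mhpmevent, raise_interrupt)."""
    if not counts_in_mode(mhpmevent, mode):
        return counter, mhpmevent, False
    total = counter + delta
    new_counter = total & COUNTER_MASK
    if total <= COUNTER_MASK:
        return new_counter, mhpmevent, False
    # On overflow the interrupt is requested only if OF was still clear;
    # OF is then set, which suppresses further requests until it is cleared.
    raise_interrupt = (mhpmevent & MHPMEVENT_OF) == 0
    return new_counter, mhpmevent | MHPMEVENT_OF, raise_interrupt
```

For example, `increment(COUNTER_MASK, 0, "U")` wraps the counter to 0, sets OF and requests the interrupt, while a counter programmed with `MHPMEVENT_UINH` is left unchanged for events that occur in U-mode.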
> > > This series implements the remaining PMU infrastructure to support
> > > the PMU in the virt machine. The first seven patches from the original
> > > series have been merged already.
> > >
> > > This will allow us to add any PMU events in the future.
> > > Currently, this series enables the following PMU events.
> > > 1. cycle count
> > > 2. instruction count
> > > 3. DTLB load/store miss
> > > 4. ITLB prefetch miss
> > >
> > > The first two are computed using host ticks while the TLB miss events
> > > are counted during cpu_tlb_fill. We can do both sampling and counting
> > > from guest userspace. This series has been tested on both RV64 and RV32.
> > > Both Linux[3] and OpenSBI[4] patches are required to get perf working.
> > >
> > > Here is an output of perf stat/report while running hackbench with the
> > > latest OpenSBI & Linux kernel.
> > >
> > > Perf stat:
> > > ==========
> > > [root@fedora-riscv ~]# perf stat -e cycles -e instructions -e dTLB-load-misses -e dTLB-store-misses -e iTLB-load-misses \
> > > > perf bench sched messaging -g 1 -l 10
> > > # Running 'sched/messaging' benchmark:
> > > # 20 sender and receiver processes per group
> > > # 1 groups == 40 processes run
> > >
> > >      Total time: 0.265 [sec]
> > >
> > >  Performance counter stats for 'perf bench sched messaging -g 1 -l 10':
> > >
> > >      4,167,825,362      cycles
> > >      4,166,609,256      instructions              #    1.00  insn per cycle
> > >          3,092,026      dTLB-load-misses
> > >            258,280      dTLB-store-misses
> > >          2,068,966      iTLB-load-misses
> > >
> > >        0.585791767 seconds time elapsed
> > >
> > >        0.373802000 seconds user
> > >        1.042359000 seconds sys
> > >
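The "insn per cycle" figure in the perf stat output above can be rechecked from the two raw counts. The snippet below is only a sanity check of that arithmetic; the function name is hypothetical and the values are copied from the output.

```python
def insn_per_cycle(instructions, cycles):
    """Ratio of retired instructions to cycles, as reported by perf stat."""
    return instructions / cycles


# Values copied from the perf stat output above.
cycles = 4_167_825_362
instructions = 4_166_609_256

print(f"{insn_per_cycle(instructions, cycles):.2f} insn per cycle")
# prints "1.00 insn per cycle"
```

A ratio this close to 1 is what one would expect when both counters are derived from the same host tick source.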
> > > Perf record:
> > > ============
> > > [root@fedora-riscv ~]# perf record -e cycles -e instructions \
> > > > -e dTLB-load-misses -e dTLB-store-misses -e iTLB-load-misses -c 10000 \
> > > > perf bench sched messaging -g 1 -l 10
> > > # Running 'sched/messaging' benchmark:
> > > # 20 sender and receiver processes per group
> > > # 1 groups == 40 processes run
> > >
> > >      Total time: 1.397 [sec]
> > > [ perf record: Woken up 10 times to write data ]
> > > Check IO/CPU overload!
> > > [ perf record: Captured and wrote 8.211 MB perf.data (214486 samples) ]
> > >
> > > [root@fedora-riscv riscv]# perf report
> > > Available samples
> > > 107K cycles
> > > 107K instructions
> > > 250 dTLB-load-misses
> > > 13 dTLB-store-misses
> > > 172 iTLB-load-misses
> > > ..
> > >
> > > Changes from v11->v12:
> > > 1. Rebased on top of the apply-next.
> > > 2. Aligned the write function & .min_priv to the previous line.
> > > 3. Fixed the FDT generation for the multi-socket scenario.
> > > 4. Dropped interrupt property from the DT.
> > > 5. Generate illegal instruction fault instead of virtual instruction
> > >    fault for VS/VU access while mcounteren is not set.
> > >
> > > Changes from v10->v11:
> > > 1. Rebased on top of the master where first 7 patches were already
> > >    merged.
> > > 2. Removed unnecessary additional check in ctr predicate function.
> > > 3. Removed unnecessary priv version checks in mcountinhibit read/write.
> > > 4. Added Heiko's reviewed-by/tested-by tags.
> > >
> > > Changes from v8->v9:
> > > 1. Added the write_done flags to the vmstate.
> > > 2. Fixed the hpmcounter read access from M-mode.
> > >
> > > Changes from v7->v8:
> > > 1. Removed ordering constraints for mhpmcounter & mhpmevent.
> > >
> > > Changes from v6->v7:
> > > 1. Fixed all the compilation errors for the usermode.
> > >
> > > Changes from v5->v6:
> > > 1. Fixed compilation issue with PATCH 1.
> > > 2. Addressed other comments.
> > >
> > > Changes from v4->v5:
> > > 1. Rebased on top of the -next with following patches.
> > >    - isa extension
> > >    - priv 1.12 spec
> > > 2. Addressed all the comments on v4
> > > 3. Removed additional isa-ext DT node in favor of riscv,isa string
> > >    update
> > >
> > > Changes from v3->v4:
> > > 1. Removed the dummy events from pmu DT node.
> > > 2. Fixed pmu_avail_counters mask generation.
> > > 3. Added a patch to simplify the predicate function for counters.
> > >
> > > Changes from v2->v3:
> > > 1. Addressed all the comments on PATCH1-4.
> > > 2. Split patch1 into two separate patches.
> > > 3. Added explicit comments to explain the event types in DT node.
> > > 4. Rebased on latest Qemu.
> > >
> > > Changes from v1->v2:
> > > 1. Dropped the Acks from v1 as significant changes happened after v1.
> > > 2. sscofpmf support.
> > > 3. A generic counter management framework.
> > >
> > > [1] https://github.com/riscv-non-isa/riscv-sbi-doc/blob/master/riscv-sbi.adoc
> > > [2] https://drive.google.com/file/d/171j4jFjIkKdj5LWcExphq4xG_2sihbfd/edit
> > > [3] https://github.com/atishp04/qemu/tree/riscv_pmu_v12
> > >
> > > Atish Patra (6):
> > > target/riscv: Add sscofpmf extension support
> > > target/riscv: Simplify counter predicate function
> > > target/riscv: Add few cache related PMU events
> > > hw/riscv: virt: Add PMU DT node to the device tree
> > > target/riscv: Update the privilege field for sscofpmf CSRs
> > > target/riscv: Remove additional priv version check for mcountinhibit
> > >
> > > hw/riscv/virt.c           |  16 ++
> > > target/riscv/cpu.c        |  12 ++
> > > target/riscv/cpu.h        |  25 +++
> > > target/riscv/cpu_bits.h   |  55 +++++
> > > target/riscv/cpu_helper.c |  25 +++
> > > target/riscv/csr.c        | 312 +++++++++++++++++-----------
> > > target/riscv/machine.c    |   1 +
> > > target/riscv/pmu.c        | 414 +++++++++++++++++++++++++++++++++++++-
> > > target/riscv/pmu.h        |   8 +
> > > 9 files changed, 749 insertions(+), 119 deletions(-)
> > >
> > > --
> > > 2.25.1
> > >
> > >
> >
> > Any other comments on this series ?
>
> Sooo....
>
> The series looks good, the patches are all reviewed, but do you mind
> rebasing this on
> https://github.com/alistair23/qemu/tree/riscv-to-apply.next ? Sorry
> about the hassle
>
>
Thanks for the review. Rebased and sent v13.


> Alistair
>
> >
> >
> > --
> > Regards,
> > Atish
> >
>
