Oliver Upton <[email protected]> writes:

On Tue, Dec 09, 2025 at 08:51:14PM +0000, Colton Lewis wrote:
+/**
+ * kvm_pmu_load() - Load untrapped PMU registers
+ * @vcpu: Pointer to struct kvm_vcpu
+ *
+ * Load all untrapped PMU registers from the VCPU into the PCPU. Mask
+ * to only bits belonging to guest-reserved counters and leave
+ * host-reserved counters alone in bitmask registers.
+ */
+void kvm_pmu_load(struct kvm_vcpu *vcpu)
+{
+       struct arm_pmu *pmu;
+       u64 mask;
+       u8 i;
+       u64 val;
+

Assert that preemption is disabled.

Will do.
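
For reference, a minimal sketch of the assertion at the top of
kvm_pmu_load(); lockdep_assert_preemption_disabled() compiles away
without lockdep but still documents the contract:

	/* This function reads/writes per-CPU PMU hardware state. */
	lockdep_assert_preemption_disabled();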

+       /*
+        * If we aren't using FGT then we are trapping everything
+        * anyway, so no need to bother with the swap.
+        */
+       if (!kvm_vcpu_pmu_use_fgt(vcpu))
+               return;

Uhh... Then how do events count in this case?

The absence of FEAT_FGT shouldn't affect the residence of the guest PMU
context. We just need to handle the extra traps, ideally by reading the
PMU registers directly in a fast-path exit handler.

Agreed. Yeah, I fixed this in my internal backports, but it looks like I
skipped incorporating the fix here.
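
A rough sketch of that shape, with an illustrative handler name (the
real plumbing would hang off the existing sysreg trap path):

/*
 * Sketch for the !FEAT_FGT case: the guest PMU context stays resident
 * on the CPU, so a trapped PMEVCNTR<n>_EL0 access can be forwarded
 * straight to the physical register.
 */
static bool handle_pmevcntr_fast(u8 idx, bool is_write, u64 *val)
{
	if (is_write)
		write_pmevcntrn(idx, *val);
	else
		*val = read_pmevcntrn(idx);

	return true;	/* handled; resume the guest */
}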

+       pmu = vcpu->kvm->arch.arm_pmu;
+
+       for (i = 0; i < pmu->hpmn_max; i++) {
+               val = __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i);
+               write_pmevcntrn(i, val);
+       }
+
+       val = __vcpu_sys_reg(vcpu, PMCCNTR_EL0);
+       write_pmccntr(val);
+
+       val = __vcpu_sys_reg(vcpu, PMUSERENR_EL0);
+       write_pmuserenr(val);

What about the host's value for PMUSERENR?
+       val = __vcpu_sys_reg(vcpu, PMSELR_EL0);
+       write_pmselr(val);

PMSELR_EL0 needs to be switched late, e.g. at sysreg_restore_guest_state_vhe().
Even though the host doesn't currently use the selector-based accessor,
I'd prefer we not load things that'd affect the host context until we're
about to enter the guest.


There's a spot in __activate_traps_common() where the host value for
PMUSERENR is saved and PMSELR is zeroed. I disabled that path when
partitioning because it was clobbering my loaded values, but I can
modify it instead to handle these things as they should be handled.
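
Something like the following ordering, sketched with an illustrative
save slot for the host value (the real storage would live wherever
__activate_traps_common() keeps it today):

/* Save the host's PMUSERENR_EL0 before installing the guest's. */
static void kvm_pmu_load_pmuserenr(struct kvm_vcpu *vcpu,
				   u64 *host_pmuserenr)
{
	*host_pmuserenr = read_pmuserenr();
	write_pmuserenr(__vcpu_sys_reg(vcpu, PMUSERENR_EL0));
}

/*
 * PMSELR_EL0 is then left alone here and only loaded just before guest
 * entry, e.g. from sysreg_restore_guest_state_vhe(), so nothing that
 * affects the host context changes until we're about to enter.
 */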

+       /* Save only the stateful writable bits. */
+       val = __vcpu_sys_reg(vcpu, PMCR_EL0);
+       mask = ARMV8_PMU_PMCR_MASK &
+               ~(ARMV8_PMU_PMCR_P | ARMV8_PMU_PMCR_C);
+       write_pmcr(val & mask);
+
+       /*
+        * When handling these:
+        * 1. Apply only the bits for guest counters (indicated by mask)
+        * 2. Use the different registers for set and clear
+        */
+       mask = kvm_pmu_guest_counter_mask(pmu);
+
+       val = __vcpu_sys_reg(vcpu, PMCNTENSET_EL0);
+       write_pmcntenset(val & mask);
+       write_pmcntenclr(~val & mask);
+
+       val = __vcpu_sys_reg(vcpu, PMINTENSET_EL1);
+       write_pmintenset(val & mask);
+       write_pmintenclr(~val & mask);

Is this safe? What happens if we put the PMU into an overflow condition?

It gets handled by the host the same as any other PMU interrupt. Though
I remember from our conversation that you don't want the latency of an
additional interrupt, so I can handle that here.
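
To sketch what handling it here could look like, where
kvm_pmu_set_overflow_pending() is a made-up helper standing in for
whatever raises the vPMU interrupt:

/*
 * If a guest counter is already in an overflow condition with its
 * interrupt enabled, flag the virtual interrupt directly rather than
 * letting the physical IRQ fire on load and be forwarded by the host.
 */
static void kvm_pmu_check_overflow_on_load(struct kvm_vcpu *vcpu,
					   u64 guest_mask)
{
	u64 ovs = __vcpu_sys_reg(vcpu, PMOVSSET_EL0);
	u64 inten = __vcpu_sys_reg(vcpu, PMINTENSET_EL1);

	if (ovs & inten & guest_mask)
		kvm_pmu_set_overflow_pending(vcpu);	/* hypothetical */
}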

+}
+
+/**
+ * kvm_pmu_put() - Put untrapped PMU registers
+ * @vcpu: Pointer to struct kvm_vcpu
+ *
+ * Put all untrapped PMU registers from the PCPU into the VCPU. Mask
+ * to only bits belonging to guest-reserved counters and leave
+ * host-reserved counters alone in bitmask registers.
+ */
+void kvm_pmu_put(struct kvm_vcpu *vcpu)
+{
+       struct arm_pmu *pmu;
+       u64 mask;
+       u8 i;
+       u64 val;
+
+       /*
+        * If we aren't using FGT then we are trapping everything
+        * anyway, so no need to bother with the swap.
+        */
+       if (!kvm_vcpu_pmu_use_fgt(vcpu))
+               return;
+
+       pmu = vcpu->kvm->arch.arm_pmu;
+
+       for (i = 0; i < pmu->hpmn_max; i++) {
+               val = read_pmevcntrn(i);
+               __vcpu_assign_sys_reg(vcpu, PMEVCNTR0_EL0 + i, val);
+       }
+
+       val = read_pmccntr();
+       __vcpu_assign_sys_reg(vcpu, PMCCNTR_EL0, val);
+
+       val = read_pmuserenr();
+       __vcpu_assign_sys_reg(vcpu, PMUSERENR_EL0, val);
+
+       val = read_pmselr();
+       __vcpu_assign_sys_reg(vcpu, PMSELR_EL0, val);
+
+       val = read_pmcr();
+       __vcpu_assign_sys_reg(vcpu, PMCR_EL0, val);
+
+       /* Mask these to only save the guest relevant bits. */
+       mask = kvm_pmu_guest_counter_mask(pmu);
+
+       val = read_pmcntenset();
+       __vcpu_assign_sys_reg(vcpu, PMCNTENSET_EL0, val & mask);
+
+       val = read_pmintenset();
+       __vcpu_assign_sys_reg(vcpu, PMINTENSET_EL1, val & mask);

What if the PMU is in an overflow state at this point?

Is this a separate concern from the point above? The state gets loaded
back as-is, and the normal interrupt machinery handles it.
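
For what it's worth, saving the overflow flags alongside the other
bitmask registers at put time would only take a couple of lines,
assuming the read_pmovsclr() accessor (which returns the same overflow
status the set register shows):

	/* Capture the guest counters' overflow flags. */
	val = read_pmovsclr();
	__vcpu_assign_sys_reg(vcpu, PMOVSSET_EL0, val & mask);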
