Oliver Upton <[email protected]> writes:
On Tue, Dec 09, 2025 at 08:51:14PM +0000, Colton Lewis wrote:
+/**
+ * kvm_pmu_load() - Load untrapped PMU registers
+ * @vcpu: Pointer to struct kvm_vcpu
+ *
+ * Load all untrapped PMU registers from the VCPU into the PCPU. In
+ * bitmask registers, write only the bits belonging to guest-reserved
+ * counters and leave the host-reserved counters alone.
+ */
+void kvm_pmu_load(struct kvm_vcpu *vcpu)
+{
+ struct arm_pmu *pmu;
+ u64 mask;
+ u8 i;
+ u64 val;
+
Assert that preemption is disabled.
Will do.
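Concretely, something like this at the top of kvm_pmu_load() (and
presumably kvm_pmu_put() as well), using the existing lockdep helper:

	lockdep_assert_preemption_disabled();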
+ /*
+ * If we aren't using FGT then we are trapping everything
+ * anyway, so no need to bother with the swap.
+ */
+ if (!kvm_vcpu_pmu_use_fgt(vcpu))
+ return;
Uhh... Then how do events count in this case?
The absence of FEAT_FGT shouldn't affect the residence of the guest PMU
context. We just need to handle the extra traps, ideally by reading the
PMU registers directly in a fast-path exit handler.
Agreed. Yeah, I fixed this in my internal backports, but it looks like
I skipped incorporating the fix here.
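For the archive, the fix is essentially dropping the early return so
the guest context stays resident even without FEAT_FGT:

	-	/*
	-	 * If we aren't using FGT then we are trapping everything
	-	 * anyway, so no need to bother with the swap.
	-	 */
	-	if (!kvm_vcpu_pmu_use_fgt(vcpu))
	-		return;

with the extra traps serviced against the live hardware registers in a
separate exit-handling hunk.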
+ pmu = vcpu->kvm->arch.arm_pmu;
+
+ for (i = 0; i < pmu->hpmn_max; i++) {
+ val = __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i);
+ write_pmevcntrn(i, val);
+ }
+
+ val = __vcpu_sys_reg(vcpu, PMCCNTR_EL0);
+ write_pmccntr(val);
+
+ val = __vcpu_sys_reg(vcpu, PMUSERENR_EL0);
+ write_pmuserenr(val);
What about the host's value for PMUSERENR?
+ val = __vcpu_sys_reg(vcpu, PMSELR_EL0);
+ write_pmselr(val);
PMSELR_EL0 needs to be switched late, e.g. at
sysreg_restore_guest_state_vhe().
Even though the host doesn't currently use the selector-based accessor,
I'd prefer we not load things that'd affect the host context until we're
about to enter the guest.
There's a spot in __activate_traps_common() where the host value for
PMUSERENR is saved and PMSELR is zeroed. I stopped taking that branch
when partitioning because it was clobbering my loaded values, but I can
modify it instead to handle these registers as they should be handled.
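Something along these lines, I think (sketch only, untested; keeps the
existing PMUSERENR_ON_CPU flag handling):

	if (kvm_arm_support_pmu_v3()) {
		struct kvm_cpu_context *hctxt = host_data_ptr(host_ctxt);

		/* Save the host's PMUSERENR_EL0 before the guest's
		 * value goes live... */
		ctxt_sys_reg(hctxt, PMUSERENR_EL0) = read_sysreg(pmuserenr_el0);
		write_sysreg(__vcpu_sys_reg(vcpu, PMUSERENR_EL0), pmuserenr_el0);

		/* ...and load the guest's PMSELR_EL0 late instead of
		 * zeroing it. */
		write_sysreg(__vcpu_sys_reg(vcpu, PMSELR_EL0), pmselr_el0);

		vcpu_set_flag(vcpu, PMUSERENR_ON_CPU);
	}

with the corresponding write_pmuserenr()/write_pmselr() calls dropped
from kvm_pmu_load().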
+ /* Load only the stateful, writable bits. */
+ val = __vcpu_sys_reg(vcpu, PMCR_EL0);
+ mask = ARMV8_PMU_PMCR_MASK &
+ ~(ARMV8_PMU_PMCR_P | ARMV8_PMU_PMCR_C);
+ write_pmcr(val & mask);
+
+ /*
+ * When handling these:
+ * 1. Apply only the bits for guest counters (indicated by mask)
+ * 2. Use the different registers for set and clear
+ */
+ mask = kvm_pmu_guest_counter_mask(pmu);
+
+ val = __vcpu_sys_reg(vcpu, PMCNTENSET_EL0);
+ write_pmcntenset(val & mask);
+ write_pmcntenclr(~val & mask);
+
+ val = __vcpu_sys_reg(vcpu, PMINTENSET_EL1);
+ write_pmintenset(val & mask);
+ write_pmintenclr(~val & mask);
Is this safe? What happens if we put the PMU into an overflow condition?
It gets handled by the host the same as any other PMU interrupt.
Though I remember from our conversation that you don't want the latency
of an additional interrupt, so I can handle that here.
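Something like this in kvm_pmu_load(), after mask is computed (assumes
a write_pmovsset() accessor added alongside the existing
write_pmovsclr()):

	/* Swap the overflow flags as well so a stale host-side
	 * overflow can't fire the moment PMINTENSET is written. */
	val = __vcpu_sys_reg(vcpu, PMOVSSET_EL0);
	write_pmovsset(val & mask);
	write_pmovsclr(~val & mask);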
+}
+
+/**
+ * kvm_pmu_put() - Put untrapped PMU registers
+ * @vcpu: Pointer to struct kvm_vcpu
+ *
+ * Put all untrapped PMU registers from the PCPU back into the VCPU.
+ * In bitmask registers, save only the bits belonging to guest-reserved
+ * counters and leave the host-reserved counters alone.
+ */
+void kvm_pmu_put(struct kvm_vcpu *vcpu)
+{
+ struct arm_pmu *pmu;
+ u64 mask;
+ u8 i;
+ u64 val;
+
+ /*
+ * If we aren't using FGT then we are trapping everything
+ * anyway, so no need to bother with the swap.
+ */
+ if (!kvm_vcpu_pmu_use_fgt(vcpu))
+ return;
+
+ pmu = vcpu->kvm->arch.arm_pmu;
+
+ for (i = 0; i < pmu->hpmn_max; i++) {
+ val = read_pmevcntrn(i);
+ __vcpu_assign_sys_reg(vcpu, PMEVCNTR0_EL0 + i, val);
+ }
+
+ val = read_pmccntr();
+ __vcpu_assign_sys_reg(vcpu, PMCCNTR_EL0, val);
+
+ val = read_pmuserenr();
+ __vcpu_assign_sys_reg(vcpu, PMUSERENR_EL0, val);
+
+ val = read_pmselr();
+ __vcpu_assign_sys_reg(vcpu, PMSELR_EL0, val);
+
+ val = read_pmcr();
+ __vcpu_assign_sys_reg(vcpu, PMCR_EL0, val);
+
+ /* Mask these to save only the guest-relevant bits. */
+ mask = kvm_pmu_guest_counter_mask(pmu);
+
+ val = read_pmcntenset();
+ __vcpu_assign_sys_reg(vcpu, PMCNTENSET_EL0, val & mask);
+
+ val = read_pmintenset();
+ __vcpu_assign_sys_reg(vcpu, PMINTENSET_EL1, val & mask);
What if the PMU is in an overflow state at this point?
Is this a separate concern from the point above? It gets loaded back
in that state and the normal interrupt machinery handles it.
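If it helps, the put side could save the flags explicitly, something
like (read_pmovsclr() returns the current overflow flags):

	val = read_pmovsclr();
	__vcpu_assign_sys_reg(vcpu, PMOVSSET_EL0, val & mask);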