On Mon, Mar 24, 2025, Mingwei Zhang wrote:
>  static void kvm_pmu_incr_counter(struct kvm_pmc *pmc)
>  {
> -	pmc->emulated_counter++;
> -	kvm_pmu_request_counter_reprogram(pmc);
> +	struct kvm_vcpu *vcpu = pmc->vcpu;
> +
> +	/*
> +	 * For perf-based PMUs, accumulate software-emulated events separately
> +	 * from pmc->counter, as pmc->counter is offset by the count of the
> +	 * associated perf event. Request reprogramming, which will consult
> +	 * both emulated and hardware-generated events to detect overflow.
> +	 */
> +	if (!kvm_mediated_pmu_enabled(vcpu)) {
> +		pmc->emulated_counter++;
> +		kvm_pmu_request_counter_reprogram(pmc);
> +		return;
> +	}
> +
> +	/*
> +	 * For mediated PMUs, pmc->counter is updated when the vCPU's PMU is
> +	 * put, and will be loaded into hardware when the PMU is loaded. Simply
> +	 * increment the counter and signal overflow if it wraps to zero.
> +	 */
> +	pmc->counter = (pmc->counter + 1) & pmc_bitmask(pmc);
> +	if (!pmc->counter) {
Ugh, this is broken for the fastpath.  If kvm_skip_emulated_instruction() is
invoked by handle_fastpath_set_msr_irqoff() or handle_fastpath_hlt(), KVM may
consume stale information (GLOBAL_CTRL, GLOBAL_STATUS, and PMCs), and even if
KVM gets lucky and those are all fresh, the PMC and GLOBAL_STATUS changes won't
be propagated back to hardware.

The best idea I have is to track whether or not the guest may be counting
branches and/or instructions based on the eventsels, and then bail from
fastpaths that need to skip instructions.  That flag would also be useful to
further optimize kvm_pmu_trigger_event().  Rough (completely untested) sketch
of the idea at the bottom of this mail.

> +		pmc_to_pmu(pmc)->global_status |= BIT_ULL(pmc->idx);
> +		if (pmc_pmi_enabled(pmc))
> +			kvm_make_request(KVM_REQ_PMI, vcpu);
> +	}
>  }
> 
>  static inline bool cpl_is_matched(struct kvm_pmc *pmc)
> -- 
> 2.49.0.395.g12beb8f557-goog
> 
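Rough sketch (completely untested; the flag and helper names below are made up
purely for illustration), assuming a per-PMU flag that is kept up-to-date
wherever eventsels are (re)programmed:

static bool kvm_pmu_needs_emulated_events(struct kvm_vcpu *vcpu)
{
	/*
	 * Hypothetical flag, set iff any enabled eventsel is counting retired
	 * instructions or branches, i.e. iff skipping an emulated instruction
	 * could need to bump a PMC and/or pend a PMI.
	 */
	return kvm_mediated_pmu_enabled(vcpu) &&
	       vcpu_to_pmu(vcpu)->counting_instrs_or_branches;
}

and then handle_fastpath_set_msr_irqoff() and handle_fastpath_hlt() would punt
to the full exit path, so that the instruction skip happens with the guest's
PMU state properly put/loaded:

	if (kvm_pmu_needs_emulated_events(vcpu))
		return EXIT_FASTPATH_NONE;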