Hi Zide,
On 1/2/26 4:27 PM, Chen, Zide wrote:
>
>
> On 12/29/2025 11:42 PM, Dongli Zhang wrote:
>> PMU MSRs are set by QEMU only at levels >= KVM_PUT_RESET_STATE,
>> excluding runtime. Therefore, updating these MSRs without stopping events
>> should be acceptable.
>
> It seems preferable to keep the existing logic. The sequence of
> disabling -> setting new counters -> re-enabling is complete and
> reasonable. Re-enabling the PMU implicitly tells KVM to do whatever
> actions are needed to make the new counters take effect.
>
> If the purpose of this patch is to improve performance, given that this is
> a non-critical path, trading this clear and robust logic for a minor
> performance gain does not seem necessary.
>
>
>> In addition, KVM creates kernel perf events with host mode excluded
>> (exclude_host = 1). While the events remain active, they don't increment
>> the counter during QEMU vCPU userspace mode.
>>
>> Finally, the kvm_put_msrs() function sets the MSRs using KVM_SET_MSRS.
>> x86 KVM processes these MSRs one by one in a loop, only saving the
>> config and raising the KVM_REQ_PMU request; it does not immediately
>> stop the event before updating the PMC. This has been the behavior
>> since Linux kernel commit 68fb4757e867 ("KVM: x86/pmu: Defer
>> reprogram_counter() to kvm_pmu_handle_event"), that is, since v6.2.
>
> This seems to assume KVM's internal behavior. While that is true today
> (and possibly in the future), it’s not necessary for QEMU to make such
> assumptions, as that could unnecessarily limit KVM’s flexibility to
> change its behavior later.
>
"Assuming KVM's internal behavior" is only one of the two reasons. The
first reason is that QEMU controls the vCPU state to ensure this action
only occurs when "level >= KVM_PUT_RESET_STATE". Thanks to that check,
QEMU avoids unnecessary updates to many MSR registers at runtime.
The main objective is to align the Intel and AMD implementations.

Both MSR_CORE_PERF_FIXED_CTR_CTRL and MSR_CORE_PERF_GLOBAL_CTRL are
reset to zero only when "has_pmu_version > 1". Otherwise, we may need to
reset the MSR_P6_PERFCTR_N registers before writing to the counter
registers. Without PATCH 7/7, an additional patch would be required to
fix the PMU register workflow so that the control registers are reset
before the counter registers.

If the plan is to keep the current logic, we may need to adjust the
logic for the AMD registers as well: in PATCH 6/7, the global registers
are never reset before the control and counter registers are written.
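To make the ordering concern concrete, here is a toy model. The MSR
indices are the architectural ones, but wrmsr_model() and
reset_pmu_model() are invented for illustration and are not QEMU or KVM
functions:

```c
#include <stdint.h>

#define MSR_P6_PERFCTR0              0x0c1
#define MSR_CORE_PERF_FIXED_CTR_CTRL 0x38d
#define MSR_CORE_PERF_GLOBAL_CTRL    0x38f

/* Toy MSR store standing in for a KVM_SET_MSRS transaction. */
static uint64_t msr_store[0x400];

static void wrmsr_model(uint32_t index, uint64_t val)
{
    msr_store[index] = val;
}

/* Disable first, then load counters: with has_pmu_version > 1, the
 * global/fixed control MSRs are zeroed before any counter is written,
 * so no counter can still be counting while its value changes. */
static void reset_pmu_model(int has_pmu_version, int num_counters,
                            const uint64_t *counters)
{
    if (has_pmu_version > 1) {
        wrmsr_model(MSR_CORE_PERF_GLOBAL_CTRL, 0);
        wrmsr_model(MSR_CORE_PERF_FIXED_CTR_CTRL, 0);
    }
    for (int i = 0; i < num_counters; i++) {
        wrmsr_model(MSR_P6_PERFCTR0 + i, counters[i]);
    }
}
```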
Would you mind sharing your thoughts on this?
Thank you very much!
Dongli Zhang