On 9/30/2025 11:09 PM, Sean Christopherson wrote: > On Tue, Sep 30, 2025, Dapeng Mi wrote: >> On 9/30/2025 1:30 PM, Borah, Chaitanya Kumar wrote: >>> Hello Sean, >>> >>> Hope you are doing well. I am Chaitanya from the linux graphics team in >>> Intel. >>> >>> This mail is regarding a regression we are seeing in our CI runs[1] on >>> linux-next repository. >>> >>> Since the version next-20250919 [2], we are seeing the following regression >>> >>> ````````````````````````````````````````````````````````````````````````````````` >>> <4>[ 10.973827] ------------[ cut here ]------------ >>> <4>[ 10.973841] WARNING: arch/x86/events/core.c:3089 at >>> perf_get_x86_pmu_capability+0xd/0xc0, CPU#15: (udev-worker)/386 >>> ... >>> <4>[ 10.974028] Call Trace: >>> <4>[ 10.974030] <TASK> >>> <4>[ 10.974033] ? kvm_init_pmu_capability+0x2b/0x190 [kvm] >>> <4>[ 10.974154] kvm_x86_vendor_init+0x1b0/0x1a40 [kvm] >>> <4>[ 10.974248] vmx_init+0xdb/0x260 [kvm_intel] >>> <4>[ 10.974278] ? __pfx_vt_init+0x10/0x10 [kvm_intel] >>> <4>[ 10.974296] vt_init+0x12/0x9d0 [kvm_intel] >>> <4>[ 10.974309] ? __pfx_vt_init+0x10/0x10 [kvm_intel] >>> <4>[ 10.974322] do_one_initcall+0x60/0x3f0 >>> <4>[ 10.974335] do_init_module+0x97/0x2b0 >>> <4>[ 10.974345] load_module+0x2d08/0x2e30 >>> <4>[ 10.974349] ? __kernel_read+0x158/0x2f0 >>> <4>[ 10.974370] ? kernel_read_file+0x2b1/0x320 >>> <4>[ 10.974381] init_module_from_file+0x96/0xe0 >>> <4>[ 10.974384] ? init_module_from_file+0x96/0xe0 >>> <4>[ 10.974399] idempotent_init_module+0x117/0x330 >>> <4>[ 10.974415] __x64_sys_finit_module+0x73/0xe0 >>> ... >>> ````````````````````````````````````````````````````````````````````````````````` >>> Details log can be found in [3]. >>> >>> After bisecting the tree, the following patch [4] seems to be the first >>> "bad" commit >>> >>> ````````````````````````````````````````````````````````````````````````````````````````````````````````` >>> From 51f34b1e650fc5843530266cea4341750bd1ae37 Mon Sep 17 00:00:00 2001 >>> >>> From: Sean Christopherson <[email protected]> >>> >>> Date: Wed, 6 Aug 2025 12:56:39 -0700 >>> >>> Subject: KVM: x86/pmu: Snapshot host (i.e. perf's) reported PMU capabilities >>> >>> Take a snapshot of the unadulterated PMU capabilities provided by perf so >>> that KVM can compare guest vPMU capabilities against hardware capabilities >>> when determining whether or not to intercept PMU MSRs (and RDPMC). >>> ````````````````````````````````````````````````````````````````````````````````````````````````````````` >>> >>> We also verified that if we revert the patch the issue is not seen. >>> >>> Could you please check why the patch causes this regression and provide >>> a fix if necessary? >> Hi Chaitanya, >> >> I suppose you found this warning on a hybrid client platform, right? It >> looks the warning is triggered by the below WARN_ON_ONCE() in >> perf_get_x86_pmu_capability() function. >> >> if (WARN_ON_ONCE(cpu_feature_enabled(X86_FEATURE_HYBRID_CPU)) || >> !x86_pmu_initialized()) { >> memset(cap, 0, sizeof(*cap)); >> return; >> } >> >> The below change should fix it (just building, not test it). I would run a >> full scope vPMU test after I come back from China national day's holiday. > I have access to a hybrid system, I'll also double check there (though I'm > 99.9% > certain you've got it right). > >> Thanks. >> >> diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c >> index cebce7094de8..6d87c25226d8 100644 >> --- a/arch/x86/kvm/pmu.c >> +++ b/arch/x86/kvm/pmu.c >> @@ -108,8 +108,6 @@ void kvm_init_pmu_capability(struct kvm_pmu_ops *pmu_ops) >> bool is_intel = boot_cpu_data.x86_vendor == X86_VENDOR_INTEL; >> int min_nr_gp_ctrs = pmu_ops->MIN_NR_GP_COUNTERS; >> >> - perf_get_x86_pmu_capability(&kvm_host_pmu); >> - >> /* >> * Hybrid PMUs don't play nice with virtualization without careful >> * configuration by userspace, and KVM's APIs for reporting supported >> @@ -120,6 +118,8 @@ void kvm_init_pmu_capability(struct kvm_pmu_ops *pmu_ops) >> enable_pmu = false; >> >> if (enable_pmu) { >> + perf_get_x86_pmu_capability(&kvm_host_pmu); >> + >> /* >> * WARN if perf did NOT disable hardware PMU if the number of >> * architecturally required GP counters aren't present, i.e. >> if > If we go this route, then the !enable_pmu path should explicitly zero > kvm_host_pmu > so that the behavior is consistent userspace loads kvm.ko with enable_pmu=0, > versus enable_pmu being cleared because of lack of support. > > if (!enable_pmu) { > memset(&kvm_host_pmu, 0, sizeof(kvm_host_pmu)); > memset(&kvm_pmu_cap, 0, sizeof(kvm_pmu_cap)); > return; > } > > The alternative would be keep kvm_host_pmu valid at all times for !HYBRID, > which > is what I intended with the bad patch, but that too would lead to inconsistent > behavior. So I think it makes sense to go with Dapeng's approach; we can > always > revisit this if some future thing in KVM _needs_ kvm_host_pmu even with > enable_pmu=0. > > if (cpu_feature_enabled(X86_FEATURE_HYBRID_CPU)) { > enable_pmu = false; > memset(&kvm_host_pmu, 0, sizeof(kvm_host_pmu)); > } else { > perf_get_x86_pmu_capability(&kvm_host_pmu); > }
Yeah, it looks better. We should decouple "enable_pmu" and "kvm_host_pmu" as the initial design. Thanks. >
