Re: REGRESSION on linux-next (next-20250919)

Borah, Chaitanya Kumar Sat, 18 Oct 2025 12:27:14 -0700



On 9/30/2025 8:39 PM, Sean Christopherson wrote:

On Tue, Sep 30, 2025, Dapeng Mi wrote:


On 9/30/2025 1:30 PM, Borah, Chaitanya Kumar wrote:

Hello Sean,

Hope you are doing well. I am Chaitanya from the linux graphics team in
Intel.

This mail is regarding a regression we are seeing in our CI runs[1] on
linux-next repository.

Since the version next-20250919 [2], we are seeing the following regression

`````````````````````````````````````````````````````````````````````````````````
<4>[   10.973827] ------------[ cut here ]------------
<4>[   10.973841] WARNING: arch/x86/events/core.c:3089 at
perf_get_x86_pmu_capability+0xd/0xc0, CPU#15: (udev-worker)/386
...
<4>[   10.974028] Call Trace:
<4>[   10.974030]  <TASK>
<4>[   10.974033]  ? kvm_init_pmu_capability+0x2b/0x190 [kvm]
<4>[   10.974154]  kvm_x86_vendor_init+0x1b0/0x1a40 [kvm]
<4>[   10.974248]  vmx_init+0xdb/0x260 [kvm_intel]
<4>[   10.974278]  ? __pfx_vt_init+0x10/0x10 [kvm_intel]
<4>[   10.974296]  vt_init+0x12/0x9d0 [kvm_intel]
<4>[   10.974309]  ? __pfx_vt_init+0x10/0x10 [kvm_intel]
<4>[   10.974322]  do_one_initcall+0x60/0x3f0
<4>[   10.974335]  do_init_module+0x97/0x2b0
<4>[   10.974345]  load_module+0x2d08/0x2e30
<4>[   10.974349]  ? __kernel_read+0x158/0x2f0
<4>[   10.974370]  ? kernel_read_file+0x2b1/0x320
<4>[   10.974381]  init_module_from_file+0x96/0xe0
<4>[   10.974384]  ? init_module_from_file+0x96/0xe0
<4>[   10.974399]  idempotent_init_module+0x117/0x330
<4>[   10.974415]  __x64_sys_finit_module+0x73/0xe0
...
`````````````````````````````````````````````````````````````````````````````````
Details log can be found in [3].

After bisecting the tree, the following patch [4] seems to be the first
"bad" commit

`````````````````````````````````````````````````````````````````````````````````````````````````````````
  From 51f34b1e650fc5843530266cea4341750bd1ae37 Mon Sep 17 00:00:00 2001

From: Sean Christopherson <[email protected]>

Date: Wed, 6 Aug 2025 12:56:39 -0700

Subject: KVM: x86/pmu: Snapshot host (i.e. perf's) reported PMU capabilities

Take a snapshot of the unadulterated PMU capabilities provided by perf so
that KVM can compare guest vPMU capabilities against hardware capabilities
when determining whether or not to intercept PMU MSRs (and RDPMC).
`````````````````````````````````````````````````````````````````````````````````````````````````````````

We also verified that if we revert the patch the issue is not seen.

Could you please check why the patch causes this regression and provide
a fix if necessary?


Hi Chaitanya,

I suppose you found this warning on a hybrid client platform, right? It
looks the warning is triggered by the below WARN_ON_ONCE() in
perf_get_x86_pmu_capability() function.

   if (WARN_ON_ONCE(cpu_feature_enabled(X86_FEATURE_HYBRID_CPU)) ||
         !x86_pmu_initialized()) {
         memset(cap, 0, sizeof(*cap));
         return;
     }

The below change should fix it (just building, not test it). I would run a
full scope vPMU test after I come back from China national day's holiday.


I have access to a hybrid system, I'll also double check there (though I'm 99.9%
certain you've got it right).

Thanks.

diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index cebce7094de8..6d87c25226d8 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -108,8 +108,6 @@ void kvm_init_pmu_capability(struct kvm_pmu_ops *pmu_ops)
         bool is_intel = boot_cpu_data.x86_vendor == X86_VENDOR_INTEL;
         int min_nr_gp_ctrs = pmu_ops->MIN_NR_GP_COUNTERS;

-       perf_get_x86_pmu_capability(&kvm_host_pmu);
-
         /*
          * Hybrid PMUs don't play nice with virtualization without careful
          * configuration by userspace, and KVM's APIs for reporting supported
@@ -120,6 +118,8 @@ void kvm_init_pmu_capability(struct kvm_pmu_ops *pmu_ops)
                 enable_pmu = false;

         if (enable_pmu) {
+               perf_get_x86_pmu_capability(&kvm_host_pmu);
+
                 /*
                  * WARN if perf did NOT disable hardware PMU if the number of
                  * architecturally required GP counters aren't present, i.e. if


If we go this route, then the !enable_pmu path should explicitly zero 
kvm_host_pmu
so that the behavior is consistent userspace loads kvm.ko with enable_pmu=0,
versus enable_pmu being cleared because of lack of support.

        if (!enable_pmu) {
                memset(&kvm_host_pmu, 0, sizeof(kvm_host_pmu));
                memset(&kvm_pmu_cap, 0, sizeof(kvm_pmu_cap));
                return;
        }

The alternative would be keep kvm_host_pmu valid at all times for !HYBRID, which
is what I intended with the bad patch, but that too would lead to inconsistent
behavior.  So I think it makes sense to go with Dapeng's approach; we can always
revisit this if some future thing in KVM _needs_ kvm_host_pmu even with 
enable_pmu=0.

        if (cpu_feature_enabled(X86_FEATURE_HYBRID_CPU)) {
                enable_pmu = false;
                memset(&kvm_host_pmu, 0, sizeof(kvm_host_pmu));
        } else {
                perf_get_x86_pmu_capability(&kvm_host_pmu);
        }



Thank you for your responses.

Following change fixes the issue for us.


diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index 40ac4cb44ed2..487ad19a236e 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c

@@ -108,16 +108,18 @@ void kvm_init_pmu_capability(const structkvm_pmu_ops *pmu_ops)

        bool is_intel = boot_cpu_data.x86_vendor == X86_VENDOR_INTEL;
        int min_nr_gp_ctrs = pmu_ops->MIN_NR_GP_COUNTERS;

-       perf_get_x86_pmu_capability(&kvm_host_pmu);
-
        /*
         * Hybrid PMUs don't play nice with virtualization without careful

* configuration by userspace, and KVM's APIs for reportingsupported* vPMU features do not account for hybrid PMUs. Disable vPMUsupport

         * for hybrid PMUs until KVM gains a way to let userspace opt-in.
         */
-       if (cpu_feature_enabled(X86_FEATURE_HYBRID_CPU))
+       if (cpu_feature_enabled(X86_FEATURE_HYBRID_CPU)) {
                enable_pmu = false;
+               memset(&kvm_host_pmu, 0, sizeof(kvm_host_pmu));
+       } else {
+               perf_get_x86_pmu_capability(&kvm_host_pmu);
+       }

Regards

Chaitanya

Re: REGRESSION on linux-next (next-20250919)

Reply via email to